perm filename V2ANS2.TEX[TEX,DEK]4 blob sn#424404 filedate 1979-03-10 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00036 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00005 00002	\input acphdr % Answer pages (double-check position of figures)
C00006 00003	%folio 760 galley 7b (C) Addison-Wesley 1978	*
C00011 00004	%folio 761 galley 8 (C) Addison-Wesley 1978	*
C00021 00005	%folio 763 galley 1 (C) Addison-Wesley 1978 	*
C00033 00006	%folio 763a galley 2 Much unreadable (C) Addison-Wesley 1978 	*
C00044 00007	%folio 768 galley 3a (C) Addison-Wesley 1978	*
C00051 00008	%folio 769 galley 3b (C) Addison-Wesley 1978	*
C00058 00009	%folio 770 galley 4 (C) Addison-Wesley 1978	*
C00067 00010	%folio 771 galley 5 (C) Addison-Wesley 1978	*
C00080 00011	%folio 776 galley 6a (C) Addison-Wesley 1978	*
C00090 00012	%folio 777 galley 6b (C) Addison-Wesley 1978	*
C00095 00013	%folio 777 galley 7 Bad beginning. (C) Addison-Wesley 1978	*
C00107 00014	%folio 781 galley 8 (C) Addison-Wesley 1978	*
C00117 00015	%folio 784 galley 9 (C) Addison-Wesley 1978	*
C00136 00016	%folio 790 galley 10 (C) Addison-Wesley 1978	*
C00151 00017	%folio 794 galley 11a (C) Addison-Wesley 1978	*
C00174 00018	%folio 795 galley 11b (C) Addison-Wesley 1978	*
C00178 00019	%folio 796 galley 11c (C) Addison-Wesley 1978	*
C00185 00020	%folio 797 galley 12 (C) Addison-Wesley 1978	*
C00195 00021	%folio 800 galley 13 (C) Addison-Wesley 1978	*
C00203 00022	%folio 802 galley 1a (C) Addison-Wesley 1978	*
C00216 00023	%folio 804 galley 1b (C) Addison-Wesley 1978	*
C00220 00024	%folio 805 galley 2 (C) Addison-Wesley 1978	*
C00237 00025	%folio 810 galley 3 (C) Addison-Wesley 1978	*
C00255 00026	%folio 818 galley 4a (C) Addison-Wesley 1978	*
C00273 00027	%folio 819 galley 4b (C) Addison-Wesley 1978	*
C00283 00028	%folio 821 galley 5 (C) Addison-Wesley 1978	*
C00298 00029	%folio 824 galley 6a (C) Addison-Wesley 1978	*
C00310 00030	%folio 826 galley 6b (C) Addison-Wesley 1978	*
C00317 00031	%folio 828 galley 7 (C) Addison-Wesley 1978	*
C00333 00032	%folio 832 galley 8 (C) Addison-Wesley 1978	*
C00350 00033	%folio 835 galley 9a (C) Addison-Wesley 1978	*
C00396 00034	%folio 840 galley 9b (C) Addison-Wesley 1978	*
C00401 00035	%folio 842 galley 10a (C) Addison-Wesley 1978	*
C00411 00036	\vfill\end
C00412 ENDMK
C⊗;
\input acphdr % Answer pages (double-check position of figures)
\runninglefthead{ANSWERS TO EXERCISES}
\titlepage\setcount00
\null
\vfill
\tenpoint
\ctrline{ANSWER PAGES for THE ART OF COMPUTER PROGRAMMING}
\ctrline{(Volume 2)}
\ctrline{(second half of the answers)}
\ctrline{$\copyright$ 1978 Addison--Wesley Publishing Company, Inc.}
\vfill
\ninepoint
\runningrighthead{ANSWERS TO EXERCISES}
\section{4.3.2}
\penalty-9999
\setcount0 562
%folio 760 galley 7b (C) Addison-Wesley 1978	*

\ansbegin{4.3.2}

\ansno 1. The solution is unique since
$7 \cdot 11 \cdot 13 = 1001$. The ``constructive'' proof of Theorem
C tells us that the answer is $\biglp (11 \cdot 13)↑6 + 6\cdot(7 \cdot
13)↑{10} + 5 \cdot (7 \cdot 11)↑{12}\bigrp \mod 1001$. But this answer
is perhaps not explicit enough! By (23) we have $v↓1 = 1$, $v↓2
= (6 - 1) \cdot 8 \mod 11 = 7$, $v↓3 = \biglp (5 - 1) \cdot 2
- 7\bigrp \cdot 6 \mod 13 = 6$, so $u = 6 \cdot 7 \cdot 11 + 7
\cdot 7 + 1 = 512$.
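The mixed-radix computation above can be spot-checked in Python (not part of the original answer; the constants $c↓{ij}$ are the modular inverses of (23), computed here with the three-argument `pow` of Python 3.8+):

```python
# Solve u = 1 (mod 7), u = 6 (mod 11), u = 5 (mod 13) by formula (23);
# c12 = 8, c13 = 2, c23 = 6 are the inverse constants used above.
c12, c13, c23 = pow(7, -1, 11), pow(7, -1, 13), pow(11, -1, 13)
assert (c12, c13, c23) == (8, 2, 6)

v1 = 1
v2 = (6 - v1) * c12 % 11                  # = 7
v3 = ((5 - v1) * c13 - v2) * c23 % 13     # = 6
u = (v3 * 11 + v2) * 7 + v1               # = 6*77 + 7*7 + 1
assert u == 512 and [u % m for m in (7, 11, 13)] == [1, 6, 5]
```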

\ansno 2. No. There is at most one such $u$; the additional condition
$u↓1≡\cdots≡u↓r\modulo 1$ is necessary and sufficient, and it follows that
such a generalization is not very interesting.

\ansno 3. $u ≡ u↓i\modulo{m↓i}$ implies that $u ≡ u↓i$ $\biglp$modulo
$\gcd(m↓i, m↓j)\bigrp $, so the condition $u↓i ≡ u↓j$ $\biglp$modulo
$\gcd(m↓i, m↓j)\bigrp$ must surely hold if there is a
solution. Furthermore if $u ≡ v\modulo {m↓j}$ for all $j$,
then $u - v$ is a multiple of $\lcm(m↓1, \ldotss , m↓r) = m$;
hence there is at most one solution.

The proof can now be completed in a nonconstructive
manner by counting the number of different $r$-tuples $(u↓1, \ldotss
, u↓r)$ satisfying the conditions $0 ≤ u↓j < m↓j$ and $u↓i ≡
u↓j$ $\biglp$modulo $\gcd(m↓i, m↓j)\bigrp$. If this number is
$m$, there must be a solution since \hbox{$(u \mod m↓1, \ldotss , u
\mod m↓r)$} takes on $m$ distinct values as $u$ goes from $a$
to $a + m$. Assume that $u↓1, \ldotss , u↓{r-1}$ have been chosen
satisfying the given conditions; we must now pick $u↓r ≡ u↓j$
$\biglp$modulo $\gcd(m↓j, m↓r)\bigrp$ for $1 ≤ j < r$, and by the
generalized Chinese Remainder Theorem for $r - 1$ elements there
are
$$\eqalign{m↓r/\!\lcm\biglp\gcd(m↓1, m↓r), \ldotss ,\gcd(m↓{r-1}, m↓r)\bigrp
⊗ = m↓r/\!\gcd\biglp\lcm(m↓1, \ldotss , m↓{r-1}), m↓r\bigrp\cr
⊗= \lcm(m↓1, \ldotss , m↓r)/\!\lcm(m↓1, \ldotss , m↓{r-1})\cr}$$
ways to do this.\xskip [This proof is based on identities
(10), (11), (12), and (14) of Section 4.5.2.]

A constructive proof [A. S. Fraenkel, {\sl Proc.\ Amer.\ Math.\ Soc.\
\bf 15} (1963), 790--791] generalizing (24) can be given
as follows. Let $M↓j = \lcm(m↓1, \ldotss , m↓j)$; we wish to
find $u = v↓rM↓{r-1} + \cdots + v↓2M↓1 + v↓1$, where $0 ≤ v↓j
< M↓j/M↓{j-1}$. Assume that $v↓1$, $\ldotss$, $v↓{j-1}$ have already
been determined; then we must solve the congruence
$$v↓jM↓{j-1} + v↓{j-1}M↓{j-2} + \cdots + v↓1 ≡ u↓j\modulo{m↓j}.$$
Here $v↓{j-1}M↓{j-2} + \cdots + v↓1 ≡ u↓i ≡ u↓j$
$\biglp$modulo $\gcd(m↓i, m↓j)\bigrp$ for $i < j$ by hypothesis,
so $c = u↓j - (v↓{j-1}M↓{j-2} + \cdots + v↓1)$ is a multiple of
$$\lcm\biglp\gcd(m↓1, m↓j), \ldotss ,\gcd(m↓{j-1},
m↓j)\bigrp = \gcd(M↓{j-1}, m↓j) = d↓j.$$
We therefore must solve $v↓jM↓{j-1} ≡ c\modulo
{m↓j}$. By Euclid's algorithm there is a number $c↓j$ such that
$c↓jM↓{j-1} ≡ d↓j\modulo {m↓j}$; hence we may take
$$v↓j = (c↓j\,c)/d↓j \mod (m↓j/d↓j).$$
Note that, as in the nonconstructive proof, we
have $m↓j/d↓j = M↓j/M↓{j-1}$.
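Fraenkel's constructive proof can be sketched as a short routine (`crt_general` is our name, not one from the text; the inverse $c↓j$ is obtained with Python's modular `pow`, standing in for Euclid's algorithm):

```python
from math import gcd, lcm

def crt_general(residues, moduli):
    """Solve u = u_j (mod m_j) for possibly non-coprime moduli by building
    the mixed-radix digits v_j of the constructive proof (a sketch)."""
    u, M = 0, 1                       # partial solution, valid modulo M = M_{j-1}
    for uj, mj in zip(residues, moduli):
        d = gcd(M, mj)                # d_j = gcd(M_{j-1}, m_j)
        c = (uj - u) % mj             # c is a multiple of d_j when solvable
        assert c % d == 0, "inconsistent system"
        v = (c // d) * pow(M // d, -1, mj // d) % (mj // d)
        u += v * M                    # 0 <= v < m_j/d_j = M_j/M_{j-1}
        M = lcm(M, mj)
    return u % M
```

For instance, `crt_general([1, 6, 5], [7, 11, 13])` returns 512, and `crt_general([3, 5], [4, 6])` returns 11, the unique solution modulo $\lcm(4,6)=12$.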
%folio 761 galley 8 (C) Addison-Wesley 1978	*

\ansno 4. (After $m↓4 = 91 = 7 \cdot 13$, we have
used up all products of two or more odd primes that can be less
than 100, so $m↓5$, $\ldots$ must all be prime.)
$$\baselineskip 14pt
\vbox{\halign{$m↓{#}\hfill=\null$⊗#,\qquad
⊗$m↓{#}\hfill=\null$⊗#,\qquad
⊗$m↓{#}\hfill=\null$⊗#,\qquad
⊗$m↓{#}\hfill=\null$⊗#,\qquad
⊗$m↓{#}\hfill=\null$⊗#,\cr
7⊗79⊗8⊗73⊗9⊗71⊗10⊗67⊗11⊗61\cr
12⊗59⊗13⊗53⊗14⊗47⊗15⊗43⊗16⊗41\cr
17⊗37⊗18⊗31⊗19⊗29⊗20⊗23⊗21⊗17\cr}}$$
and then we are stuck ($m↓{22} = 1$ does no good).

\ansno 5. No. The obvious upper bound,
$$3↑45↑27↑211↑1 \cdots = \prod ↓{\scriptstyle p\,\hbox{\:e odd}\atop
\scriptstyle p\,\hbox{\:e prime}}p↑{\lfloor\log↓p 100\rfloor},$$
is attained if we choose $m↓1
= 3↑4$, $m↓2 = 5↑2$, etc.\xskip (It is more difficult, however, to maximize
$m↓1 \ldotsm m↓r$ when $r$ is fixed, or to maximize $m↓1 + \cdots
+ m↓r$ as we would attempt to do when using moduli $2↑{m↓j}- 1$.)

\ansno 6. (a)\9 If $e = f + kg$, then $2↑e = 2↑f(2↑g)↑k
≡ 2↑f \cdot 1↑k \modulo {2↑g - 1}$. So if $2↑e ≡ 2↑f\modulo
{2↑g - 1}$, we have $2↑{e\mod g} ≡ 2↑{f\mod g} \modulo{2↑g -
1}$; and since the latter quantities lie between zero and $2↑g
- 1$ we must have $e \mod g = f \mod g$.\xskip (b) By part (a),
$(1 + 2↑d + \cdots + 2↑{(c-1)d}) \cdot (2↑e - 1) ≡ (1 + 2↑d
+ \cdots + 2↑{(c-1)d}) \cdot (2↑d - 1) = 2↑{cd} - 1 ≡ 2↑{ce} - 1 ≡ 2↑1 - 1 = 1
\modulo{2↑f - 1}$.
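Part (a) is easy to spot-check numerically (a quick sketch, not in the original):

```python
# 2^e mod (2^g - 1) depends only on e mod g, since 2^g = 1 (mod 2^g - 1):
g = 11
for e in (0, 5, 23, 100):
    assert pow(2, e, 2**g - 1) == pow(2, e % g, 2**g - 1)
```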

\ansno 7. $\biglp u↓j - \biglp v↓1 + m↓1(v↓2 + m↓2(v↓3
+ \cdots + m↓{j-2}v↓{j-1})\ldotsm)\bigrp\bigrp\,c↓{1j}\ldotsm c↓{(j-1)j}$

\penalty 1000
\vbox{\halign{\hbox to size{\hskip 40pt$#$}\cr
=(u↓j - v↓1)c↓{1j} \ldotsm
c↓{(j-1)j} - m↓1v↓2c↓1 \ldotsm c↓{(j-1)j} - \cdots\hfill\cr
\hfill \null-m↓1 \ldotsm m↓{j-2}v↓{j-1}c↓{1j} \ldotsm c↓{(j-1)j}\qquad\cr
\noalign{\vskip 3pt}
≡(u↓j - v↓1)c↓{1j} \ldotsm c↓{(j-1)j} - v↓2c↓{2j} \ldotsm c↓{(j-1)j}
- \cdots - v↓{j-1}c↓{(j-1)j}\hfill\cr
\noalign{\vskip 3pt}
= \biglp \ldotsm ((u↓j - v↓1)c↓{1j} - v↓2)c↓{2j} - \cdots
- v↓{j-1}\bigrp\,c↓{(j-1)j}\modulo {m↓j}.\hfill\cr}}

\yskip This method of rewriting the formulas uses the
same number of arithmetic operations and fewer constants; but
the number of constants is smaller only if we order the moduli
so that $m↓1 < m↓2 < \cdots < m↓r$; otherwise we would need
a table of $m↓i \mod m↓j$. This ordering of the moduli might
seem to require more computation than if we made $m↓1$ the largest,
$m↓2$ the next largest, etc., since there are many more operations
to be done modulo $m↓r$ than modulo $m↓1$; but since $v↓j$ can
be as large as $m↓j - 1$, we are better off with $m↓1 < m↓2
< \cdots < m↓r$ in (23) also. So this idea appears to be preferable
to the formulas in the text, although the formulas in the text
are advantageous when the moduli have the form (14), as shown
in Section 4.3.3.

\ansno 8. $m↓{j-1} \ldotsm m↓1v↓j ≡ m↓{j-1} \ldotsm m↓1\,\biglp
\ldotsm((u↓j - v↓1)c↓{1j} - v↓2)c↓{2j} - \cdots - v↓{j-1}\bigrp\,
c↓{(j-1)j} ≡ m↓{j-2} \ldotsm m↓1\,\biglp \ldotsm(u↓j - v↓1)c↓{1j}
- \cdots - v↓{j-2}\bigrp\,c↓{(j-2)j} - v↓{j-1}m↓{j-2} \ldotsm
m↓1 ≡ \cdots ≡ u↓j - v↓1 - v↓2m↓1 - \cdots - v↓{j-1}m↓{j-2}
\ldotsm m↓1\modulo {m↓j}$.

\ansno 9. $u↓r ← \biglp (\ldotsm (v↓rm↓{r-1} + v↓{r-1})m↓{r-2}
+ \cdotss)m↓1 + v↓1\bigrp \mod m↓r$, \ $\ldotss $,

\penalty1000
\rjustline{$u↓2 ← (v↓2m↓1 + v↓1)\mod m↓2$, \ $u↓1 ← v↓1 \mod m↓1$.}
\yskip $\biglp$The computation should be done in this order,
if we want to let $u↓j$ and $v↓j$ share the same memory locations,
as they can in (23).$\bigrp$

\ansno 10. If we redefine the ``mod'' operator so that it produces
residues in the symmetrical range, the basic formulas (2), (3),
(4) for arithmetic and (23), (24) for conversion remain the
same, and the number $u$ in (24) lies in the desired range (10).
$\biglp$Here (24) is a {\sl balanced mixed-radix} notation, generalizing
``balanced ternary'' notation.$\bigrp$\xskip The comparison of two numbers
may still be done from left to right, in the simple manner described
in the text. Furthermore, it is possible to retain the value
$u↓j$ in a single computer word, if we have signed magnitude
representation within the computer, even if $m↓j$ is almost
twice the word size. But the arithmetic operations analogous
to (11) and (12) are more difficult, so it appears that on most
computers this idea would result in slightly slower operation.

\ansno 11. Multiply by
$${m + 1\over 2} = \left({m↓1 + 1\over 2}, \ldotss , {m↓r
+ 1\over 2}\right).$$
Note that $2t \cdot {m + 1\over 2} ≡ t\modulo m$.
In general if $v$ is relatively prime to $m$, then
we can find (by Euclid's algorithm) a number $v↑\prime = (v↑{\prime}↓{1},
\ldotss , v↑{\prime}↓{\hskip-.8333pt r})$ such that $vv↑\prime ≡ 1\modulo
m$; and then if $u$ is known to be a multiple of $v$ we have
$u/v = uv↑\prime $, where the latter is computed with modular
multiplication. When $v$ is not relatively prime to $m$, division
is much more difficult.
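For instance, with a single modulus standing in for the residue components (a sketch; the moduli are those of exercise 1):

```python
m = 7 * 11 * 13                  # stands for the components (m1, m2, m3)
t = 393
assert 2 * t * ((m + 1) // 2) % m == t    # multiplying by (m+1)/2 halves 2t

v, u = 6, 6 * 91                 # u is a multiple of v, and gcd(v, m) = 1
v_inv = pow(v, -1, m)            # v' with v*v' = 1 (mod m), via Euclid's algorithm
assert u * v_inv % m == u // v   # so u/v = u*v' under modular multiplication
```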

\ansno 12. Obvious from (11), if we replace $m↓j$ by $m$.

\ansno 13. (a)\9 $x↑2 - x = (x - 1)x ≡ 0\modulo{10↑n}$ is equivalent
to $(x - 1)x ≡ 0 \modulo {p↑n}$ for $p = 2$ and 5. Either $x$
or $x - 1$ must be a multiple of $p$, and then the other is
relatively prime to $p↑n$; so either $x$ or $x - 1$ must be
a multiple of $p↑n$. If $x \mod 2↑n = x \mod 5↑n = 0$ or 1,
we must have $x \mod 10↑n = 0$ or 1; hence
automorphs have $x \mod 2↑n ≠ x \mod 5↑n$.\xskip
(b) If $x = qp↑n + r$, where $r = 0$ or 1, then $r ≡ r↑2 ≡ r↑3$,
so $3x↑2 - 2x↑3 ≡ (6qp↑nr + 3r) - (6qp↑nr + 2r) ≡ r\modulo
{p↑{2n}}$.\xskip (c) Let $c↑\prime = \biglp 3(cx)↑2 - 2(cx)↑3\bigrp /x↑2
= 3c↑2 - 2c↑3x$.

{\sl Note:} Since the last $k$ digits of
an $n$-digit automorph form a $k$-digit automorph, it makes
sense to speak of the two $∞$-digit automorphs, $x$ and $1 - x$,
which are 10-adic numbers (cf.\ exercise 4.1--31). The set of
10-adic numbers is equivalent under modular arithmetic to the set
of ordered pairs $(u↓1,u↓2)$, where $u↓1$ is a 2-adic number and $u↓2$
is a 5-adic number.
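The lifting in part (b) doubles the number of correct digits at each step; a small sketch in Python (the function name is ours):

```python
def automorph(n):
    """The n-digit automorph ending in 5: an x with x^2 = x (mod 10^n)."""
    x, k = 5, 1
    while k < n:
        k *= 2
        x = (3*x*x - 2*x**3) % 10**k    # part (b): digit count doubles
    return x % 10**n

x = automorph(5)                        # = 90625
assert x * x % 10**5 == x
y = (1 - x) % 10**5                     # the other automorph, 09376
assert y * y % 10**5 == y
```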
%folio 763 galley 1 (C) Addison-Wesley 1978 	*
\ansbegin{4.3.3}

{\baselineskip0pt\lineskip0pt\def\\{\lower 2.5pt\vbox to 11pt{}}
\anskip\halign{\hbox to 19pt{#}⊗$\rt{#}$\qquad⊗$\rt{#}$\qquad⊗$\rt{#}$\qquad
⊗$\rt{#}$\cr
\bf1.\ \lower 4.5pt\vbox to 13pt{}⊗27 \times47:
⊗18 \times 42:⊗09 \times 05:⊗2718 \times 4742:\cr
\\⊗08\9\9⊗04\9\9⊗00\9\9⊗1269\9\9\9\9\cr
\\⊗08\9⊗04\9⊗00\9⊗1269\9\9\cr
\\⊗-15\9⊗14\9⊗-45\9⊗-0045\9\9\cr
\\⊗49\9⊗16\9⊗45\9⊗0756\9\9\cr
\\⊗49⊗16⊗45⊗0756\cr
\lower 1pt\vbox to 2.4pt{}⊗\vbox{\hrule width 18pt}⊗\vbox{\hrule width 18pt}⊗
\vbox{\hrule width 18pt}⊗\vbox{\hrule width 36pt}\cr
\lower 5.5pt\vbox to 14pt{}⊗1269⊗0756⊗0045⊗12888756\cr}}
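In Python, the single recursive step shown in the tableau is (a sketch, not part of the answer):

```python
# 2718 x 4742 from the three half-size products of the tableau:
u1, u0, v1, v0 = 27, 18, 47, 42
hi  = u1 * v1                            # 27 x 47 = 1269
lo  = u0 * v0                            # 18 x 42 =  756
sub = (u1 - u0) * (v1 - v0)              # 09 x 05 =   45, to be subtracted
result = hi * 10**4 + (hi + lo - sub) * 10**2 + lo
assert result == 2718 * 4742 == 12888756
```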

\ansno 2. $\sqrt{Q + \lfloor \sqrt Q\rfloor
} ≤ \sqrt{Q + \sqrt Q} < 
\sqrt{Q + 2\sqrt Q + 1} = \sqrt{Q} + 1$, so $\lfloor
\sqrt{Q + R}\rfloor ≤ \lfloor \sqrt{Q}\rfloor + 1$.

\ansno 3. When $k ≤ 2$, the result is
true, so assume that $k > 2$. Let $q↓k = 2↑{Q↓k}$, $r↓k =
2↑{R↓k}$, so that $R↓k = \lfloor \sqrt{Q↓k}\rfloor$
and $Q↓k = Q↓{k-1} + R↓{k-1}$. We must show that $1 + (R↓k +
1)2↑{R↓k} ≤ 2↑{Q↓{k-1}}$; this inequality
isn't close at all; one way to prove it is to observe that $1 + (R↓k + 1)2↑{R↓k}
≤ 1 + 2↑{2R↓k}$ and $2R↓k < Q↓{k-1}$ when $k > 2$.\xskip
(The fact that $2R↓k < Q↓{k-1}$ is readily proved by induction
since $R↓{k+1} - R↓k ≤ 1$ and $Q↓k - Q↓{k-1} ≥ 2$.)

\ansno 4. For $j = 1$, $\ldotss$, $r$, calculate $U↓e(j↑2)$,
$jU↓o(j↑2)$, $V↓e(j↑2)$, $jV↓o(j↑2)$; and by recursively calling
the multiplication algorithm, calculate
$$\eqalign{W(j) ⊗= \biglp U↓e(j↑2) + jU↓o(j↑2)\bigrp
\biglp V↓e(j↑2) + jV↓o(j↑2)\bigrp ,\cr
W(-j) ⊗= \biglp U↓e(j↑2) - jU↓o(j↑2)\bigrp
\biglp V↓e(j↑2) - jV↓o(j↑2)\bigrp ;\cr}$$
and then we have $W↓e(j↑2) = {1\over 2}\biglp W(j) +
W(-j)\bigrp$, $W↓o(j↑2) = {1\over 2}\biglp W(j) - W(-j)\bigrp
$. Also calculate $W↓e(0) = U(0)V(0)$. Now construct difference
tables for $W↓e$ and $W↓o$, which are polynomials whose respective
degrees are $r$ and $r - 1$.

This method reduces the size of the numbers being
handled, and reduces the number of additions and multiplications.
Its only disadvantage is a longer program (since the control
is somewhat more complex, and some of the calculations must
be done with signed numbers).

Another possibility would perhaps be
to evaluate $W↓e$ and $W↓o$ at $1↑2$, $2↑2$, $4↑2$, $\ldotss$, $(2↑r)↑2$;
although the numbers involved are larger, the calculations are
faster, since all multiplications are replaced by shifting and
all divisions are by binary numbers of the form $2↑j(2↑k - 1)$.\xskip
(Simple procedures are available for dividing by such numbers.)

\ansno 5. Start the $q, r$ sequences out with $q↓0$ and
$q↓1$ large enough so that the inequality in exercise 3 is valid.
Then we will find in the formulas analogous to those preceding
Theorem C that $\eta ↓1 → 0$ and $\eta ↓2 = \biglp 1 + 1/(2r↓k)\bigrp 2↑{1
+\sqrt{2Q↓k}-\sqrt{2Q↓{k+1}}}\,(Q↓k/Q↓{k+1})$. The factor $Q↓k/Q↓{k+1} → 1$
as $k → ∞$, so we can ignore it if we want to show that $\eta ↓2
< 1 - ε$ for all large $k$. Now $\sqrt{2Q↓{k+1}} = 
\sqrt{2Q↓k + 2\lceil\sqrt{ 2Q↓k}\,\rceil + 2} ≥ 
\sqrt{(2Q↓k + 2\sqrt{2Q↓k} + 1) + 1} ≥ \sqrt{2Q↓k} + 1 + 1/(3R↓k)$. Hence $\eta
↓2 ≤ \biglp1 + 1/(2r↓k)\bigrp2↑{-1/(3R↓k)}$, and $\lg\eta ↓2 < 0$ for
large enough $k$.

{\sl Note:} Algorithm C can also be modified to
define a sequence $q↓0$, $q↓1$, $\ldots$ of a similar type that
is based on $n$, so that $n \approx q↓k + q↓{k+1}$ after step
C1. This modification leads to the estimate (19).

\ansno 6. Any common divisor of
$6q + d↓1$ and $6q + d↓2$ must also divide their difference
$d↓2 - d↓1$. The $6\choose2$ differences are 2, 3, 4, 6,
8, 1, 2, 4, 6, 1, 3, 5, 2, 4, 2, so we must only show that at
most one of the given numbers is divisible by each of the primes
2, 3, 5. Clearly only $6q + 2$ is even, and only $6q + 3$ is
a multiple of 3; and there is at most one multiple of 5, since
$q↓k \neqv 3\modulo 5$.

\ansno 7. $t↓k ≤ 6t↓{k-1} + ck3↑k$ for some constant $c$; so
$t↓k/6↑k ≤ t↓{k-1}/6↑{k-1} + ck/2↑k ≤ t↓0 + c \sum ↓{j≥1}(j/2↑j)
= M$. Thus $t↓k ≤ M \cdot 6↑k$.

\ansno 8. Let $2↑k$ be the smallest power of 2 that exceeds $2K$. Set $a↓t←\omega
↑{-t↑2/2}u↓t$ and $b↓t←\omega↑{(2K-2-t)↑2/2}$, where $u↓t=0$ for $t≥K$. We want
to calculate the convolutions $c↓r=\sum↓{0≤j≤r}a↓jb↓{r-j}$ for $r=2K-2-s$, when
$0≤s<K$. The convolutions can be found by using three fast Fourier transformations
of order $2↑k$, as in the text's multiplication procedure.\xskip[The origin of
this trick is unknown.]

\ansno 9. $\s u↓s=\A u↓{(qs)\mod K}$. In particular, if $q=-1$ we get $\A u↓{(-
r)\mod K}$, which avoids shuffling when computing inverse transforms.

\ansno 10. $A↑{[j]}(s↓{k-1},\ldotss,s↓{k-j},t↓{k-j-1},\ldotss,t↓0)=\hskip-.6pt
\sum↓{0≤t↓{k-1},
\ldotss,t↓{k-j}≤1}\omega↑{(s↓0\ldotsm s↓{k-1})↓2\cdot(t↓{k-1}\ldotsm t↓{k-j}0\ldotsm
0)↓2}\*\biglp\sum↓p\omega↑{tp}u↓p\bigrp\biglp\sum↓q\omega↑{tq}v↓q\bigrp=
\sum↓{p,q}u↓pv↓qS(p,q)$, where $S(p,q)=0$ or $2↑j$. We have $S(p,q)=2↑j$
for exactly $2↑{2k}/2↑j$ values of $p$ and $q$.

\ansno 11. An automaton cannot have $z↓2 = 1$ until it has $c
≥ 2$, and this occurs first for $M↓j$ at time $3j - 1$. It follows
that $M↓j$ cannot have $z↓2z↓1z↓0 ≠ 000$ until time $3(j - 1)$.
Furthermore, if $M↓j$ has $z↓0 ≠ 0$ at time $t$, we cannot change
this to $z↓0 = 0$ without affecting the output; but the output
cannot be affected by this value of $z↓0$ until at least time
$t + j - 1$, so we must have $t + j - 1 ≤ 2n$. Since the first
argument we gave proves that $3(j - 1) ≤ t$, we must have $4(j
- 1) ≤ 2n$, that is, $j - 1 ≤ n/2$, i.e., $j ≤ \lfloor n/2\rfloor
+ 1$. This is the best possible bound,
since the inputs $u = v = 2↑n - 1$ require the use of $M↓j$
for all $j ≤ \lfloor n/2\rfloor + 1$.\xskip (For example, note from
Table 1 that $M↓2$ is needed to multiply two-bit numbers, at
time 3.)

\ansno 12. We can ``sweep through'' $K$ lists of \MIX-like instructions, executing
the first instruction on each list, in $O\biglp K+(N\log N)↑2\bigrp$ steps as
follows:\xskip(1) A radix list sort (Section 5.2.5) will group together all
identical instructions, in time $O(K+M)$.\xskip(2) Each set 
\eject % good place to break page (March 7, 1979)
of $j$ identical
instructions can be performed in $O(\log N)↑2+O(j)$ steps, and there are $O(N↑2)$
sets.
A bounded number of sweeps will finish all the lists. The remaining details are
straightforward; for example, arithmetic operations can be simulated by converting
$p$ and $q$ to binary.\xskip[To appear.]

\ansno 13. If it takes $T(n)$ steps to multiply
$n$-bit numbers, we can accomplish the multiplication of $m$-bit
by $n$-bit by breaking the $n$-bit number into $\lceil n/m\rceil$
$m$-bit groups, using $\lceil n/m\rceil T(m) + O(n + m)$ operations.
The results of this section therefore give an estimated running
time of $O(n\log m\log\log m)$ on Turing machines, or $O(n\log m)$ on machines with
random access to words of bounded size, or $O(n)$ on pointer machines.
%folio 763a galley 2 Much unreadable (C) Addison-Wesley 1978 	*
\ansbegin{4.4}

\ansno 1. We compute $\biglp \ldotsm (a↓mb↓{m-1}
+ a↓{m-2})\,b↓{m-2} +\cdots + a↓1\bigrp \,b↓1 + a↓0$
by adding and multiplying in the $B↓j$ system.
$$\vbox{\halign{#\hfill⊗\hbox to 50pt{\hfill#\hfill}⊗\!
\hbox to 50pt{\hfill#\hfill}⊗\!
\hbox to 50pt{\hfill#\hfill}⊗\!
\hbox to 50pt{\hfill#\hfill}⊗\!
\hbox to 50pt{\hfill#\hfill}\cr
⊗T.⊗$=20($cwt.⊗$=8($st.⊗$=14($lb.⊗$=16$ oz.)))\cr
\noalign{\vskip 2pt}
Start with zero⊗0⊗0⊗0⊗\90⊗\90\cr
Add 3⊗0⊗0⊗0⊗\90⊗\93\cr
Multiply by 24⊗0⊗0⊗0⊗\94⊗\98\cr
Add 9⊗0⊗0⊗0⊗\95⊗\91\cr
Multiply by 60⊗0⊗2⊗5⊗\99⊗12\cr
Add 12⊗0⊗2⊗5⊗10⊗\98\cr
Multiply by 60⊗8⊗3⊗1⊗\90⊗\90\cr
Add 37⊗8⊗3⊗1⊗\92⊗\95\cr}}$$
(Addition and multiplication by a constant in a mixed-radix system are readily
done using a simple generalization of the usual carry rule; cf.\ exercise 4.3.1--9.)

\ansno 2. We compute $\lfloor u/B↓0\rfloor$, $\lfloor\lfloor
 u/B↓0\rfloor/B↓1\rfloor$, etc., and the remainders are $A↓0$, $A↓1$, etc.
The division is done in the $b↓j$ system.
$$\vbox{\halign{#\hfill⊗\hbox to 50pt{\hfill#\hfill}⊗\!
\hbox to 50pt{\hfill#\hfill}⊗\!
\hbox to 50pt{\hfill#\hfill}⊗\!
\hbox to 50pt{\hfill#\hfill}⊗\!
#\hfill\cr
⊗d.⊗$=24($h.⊗$=60($m.⊗$=60$ s.))\cr
\noalign{\vskip 2pt}
Start with $u$⊗3⊗9⊗12⊗37\cr
Divide by 16⊗0⊗5⊗\94⊗32⊗Remainder$\null=5$\cr
Divide by 14⊗0⊗0⊗21⊗45⊗Remainder$\null=2$\cr
Divide by 8⊗0⊗0⊗\92⊗43⊗Remainder$\null=1$\cr
Divide by 20⊗0⊗0⊗\90⊗\98⊗Remainder$\null=3$\cr
Divide by $∞$⊗0⊗0⊗\90⊗\90⊗Remainder$\null=8$\cr}}$$
{\sl Answer:} 8 T. 3 cwt. 1 st. 2 lb. 5 oz.
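The digitwise division used in this tableau can be sketched as a short routine (names ours):

```python
def mixed_divmod(digits, radices, v):
    """Long division of a mixed-radix number by a small integer v,
    most significant digit first; radices[i] is the radix between
    digit i and digit i+1 (d., h., m., s. has radices 24, 60, 60)."""
    q, r = [], 0
    for i, d in enumerate(digits):
        r = (r * radices[i - 1] if i else 0) + d
        q.append(r // v)
        r %= v
    return q, r

assert mixed_divmod([3, 9, 12, 37], (24, 60, 60), 16) == ([0, 5, 4, 32], 5)
assert mixed_divmod([0, 5, 4, 32], (24, 60, 60), 14) == ([0, 0, 21, 45], 2)
```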

\ansno 3. The following procedure due to G. L. Steele Jr.\ and Jon L. White
generalizes Taranto's algorithm for $B=2$ originally published in
{\sl CACM \bf 2}, 7 (July 1959), 27.

\algstep A1. [Initialize.] Set $M←0$, $U↓0←0$.

\algstep A2. [Done?] If $u<ε$ or $u>1-ε$, go to step A4.\xskip (Otherwise no
$M$-place fraction will satisfy the given conditions.)

\algstep A3. [Transform.] Set $M←M+1$, $U↓{-M}←\lfloor Bu\rfloor$,
$u←Bu\mod1$, $ε←Bε$, and return to A2.\xskip (This transformation returns us to
essentially the same state we were in before: the remaining problem is to
convert $u$ to $U$ with fewest radix-$B$ places so that $|U-u|<ε$. Note,
however, that $ε$ may now be $≥1$; in this case we could go immediately to step
A4 instead of storing the new value of $ε$.)

\algstep A4. [Round.] If $u≥{1\over2}$, increase $U↓{-M}$ by 1.\xskip
(If $u={1\over2}$
exactly, another rounding rule such as ``increase $U↓{-M}$ by 1 only when it is
odd'' might be preferred.)\quad\blackslug

\yskip\noindent Step A4 will never increase $U↓{-M}$ from $B-1$ to $B$; for if
$U↓{-M}=B-1$ we must have $M>0$, but no $(M-1)$-place fraction was sufficiently
accurate. Steele and White
go on to consider floating-point conversions in their paper [to appear].
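Algorithm A can be sketched in exact arithmetic with Python's `Fraction` (not part of the original answer; this version assumes $ε ≤ u ≤ 1 - ε$ holds initially, so at least one digit is produced):

```python
from fractions import Fraction as F

def shortest_fraction(u, eps, B=10):
    """Digits of the shortest radix-B fraction U with |U - u| < eps."""
    digits = []
    while eps <= u <= 1 - eps:      # step A2: no M-place fraction suffices yet
        d, u = divmod(u * B, 1)     # step A3: emit a digit, rescale the problem
        digits.append(d)
        eps *= B
    if 2 * u >= 1:                  # step A4: round up (never past B-1, as noted)
        digits[-1] += 1
    return digits

assert shortest_fraction(F(12345, 100000), F(1, 1000)) == [1, 2, 3]
assert shortest_fraction(F(2, 3), F(1, 100)) == [6, 7]
```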

\ansno 4. (a)\9 $1/2↑k=5↑k/10↑k$.\xskip(b) Every prime divisor of $b$ divides $B$.

\ansno 5. Iff $10↑n-1≤c<w$, cf.\ (3).

\ansno 7. $αu≤ux≤αu+u/w≤αu+1$, hence $\lfloor αu\rfloor≤\lfloor ux\rfloor≤
\lfloorαu+1\rfloor$. Furthermore, in the special case cited we have $ux<αu+α$ and
$\lfloorαu\rfloor=\lfloorαu+α-ε\rfloor$.

\mixans 8. {⊗⊗ENT1⊗0\cr
⊗⊗LDA⊗U\cr
\\⊗1H⊗MUL⊗=1//10=\cr
\\⊗3H⊗STA⊗TEMP\cr
⊗⊗MUL⊗=-10=\cr
\\⊗⊗SLAX⊗5\cr
\\⊗⊗ADD⊗U\cr
⊗⊗JANN⊗2F\cr
\\⊗⊗LDA⊗TEMP⊗(Can occur only on\cr
⊗⊗DECA⊗1⊗\qquad the first iteration,\cr
⊗⊗JMP⊗3B⊗\qquad by exercise 7.)\cr
\\⊗2H⊗STA⊗ANSWER,1⊗(May be minus zero.)\cr
\\⊗⊗LDA⊗TEMP\cr
⊗⊗INC1⊗1\cr
⊗⊗JAP⊗1B⊗\quad\blackslug\cr}

\ansno 9. If $x↑\prime$ is an integer, $x-ε≤x↑\prime≤x$, then $(1+1/n)x-
\biglp(1+1/n)ε+1-1/n\bigrp≤x↑\prime+\lfloor x↑\prime/n\rfloor≤(1+1/n)x$.
Hence if $α$ is the binary fraction satisfying
$$1/10 - 2↑{-35} < α = (.000110011001100110011001100110011)↓2 < 1/10,$$
we find that $αu - ε ≤ v ≤ αu$ at the end of the computation, where
$$ε = \textstyle{7\over 8} + (.100010001010100011001000101010001)↓2 <
{3\over 2}.$$
Hence $u/10 - 2 < u/10 - \biglp ε + (1/10 - α)u\bigrp
≤ v ≤ αu < u/10$. Since $v$ is an integer, the proof is complete.
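The bound can be spot-checked by computing $v = \lfloor αu\rfloor$ directly with the scaled 33-bit constant (a sketch; the MIX routine's $v$ obeys the slightly weaker bound derived above):

```python
a = 2**33 // 10          # = 858993459, the 33-bit constant alpha scaled by 2^33
for u in list(range(1, 1000)) + [10**9, 2**35 - 1]:
    v = (u * a) >> 33    # v = floor(alpha * u)
    assert u - 20 < 10 * v < u      # hence u/10 - 2 < v < u/10
```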

\ansno 10. (a)\9 Shift right one;\xskip (b) Extract left bit of each
group;\xskip (c) Shift result of (b) right two;\xskip (d) Shift result of
(c) right one, and add to result of (c);\xskip (e) Subtract result
of (d) from result of (a).

{\annskip\vbox{\baselineskip0pt\lineskip0pt
\def\dot{\hbox to 4.5pt{\hfill.\hfill}}
\def\\{\lower 2.25pt\vbox to 11pt{}}
\def\¬{$-$ \\}
\def\bar{\lower 1pt\vbox to 2.4pt{}}
\halign{\hbox to 19pt{\hfill#}⊗\hfill#⊗#\hfill\qquad⊗#\hfill\cr
\bf 11. ⊗\\⊗5\dot7\97\92\91\cr
⊗\¬⊗1\90\cr
⊗\bar⊗\vbox{\hrule width 13.5pt}\cr
⊗\\⊗4\97\dot7\92\91\cr
⊗\¬⊗\9\99\94\cr
⊗\bar⊗\vbox{\hrule width 22.5pt}\cr
⊗\\⊗3\98\93\dot2\91\cr
⊗\¬⊗\9\97\96\96\cr
⊗\bar⊗\vbox{\hrule width 31.5pt}\cr
⊗\\⊗3\90\96\96\dot1\cr
⊗\¬⊗\9\96\91\93\92\cr
⊗\bar⊗\vbox{\hrule width 40.5pt}\cr
⊗\lower 4.25pt\vbox to 13pt{}⊗2\94\95\92\99⊗\sl Answer: $(24529)↓{10}$.\cr}}}
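The subtractions in this tableau implement $x ← 8x + d$ using only decimal shifting and subtraction; in Python terms (a sketch):

```python
x = 0
for d in (5, 7, 7, 2, 1):        # the octal digits of (57721)_8
    x = 10 * x - 2 * x + d       # 8x + d: shift left in decimal, subtract 2x
assert x == 24529
```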

\ansno 12. First convert the ternary number
to nonary (radix 9) notation, then proceed as in octal-to-decimal
conversion but without doubling. Decimal to nonary is similar.
In the given example, we have
$$\hbox to size{\vbox{\baselineskip0pt\lineskip0pt
\def\dot{\hbox to 4.5pt{\hfill.\hfill}}
\def\\{\lower 2.25pt\vbox to 11pt{}}
\def\¬{$-$ \\}
\def\bar{\lower 1pt\vbox to 2.4pt{}}
\halign{\hfill#⊗#\hfill⊗\qquad#\hfill\cr
\\⊗1\dot7\96\94\97\92\93\cr
\¬⊗\9\91\cr
\bar⊗\vbox{\hrule width 13.5pt}\cr
\\⊗1\96\dot6\94\97\92\93\cr
\¬⊗\9\91\96\cr
\bar⊗\vbox{\hrule width 22.5pt}\cr
\\⊗1\95\90\dot4\97\92\93\cr
\¬⊗\9\91\95\90\cr
\bar⊗\vbox{\hrule width 31.5pt}\cr
\\⊗1\93\95\94\dot7\92\93\cr
\¬⊗\9\91\93\95\94\cr
\bar⊗\vbox{\hrule width 40.5pt}\cr
\\⊗1\92\91\99\93\dot2\93\cr
\¬⊗\9\91\92\91\99\93\cr
\bar⊗\vbox{\hrule width 49.5pt}\cr
\\⊗1\90\99\97\93\99\dot3\cr
\¬⊗\9\91\90\99\97\93\99\cr
\bar⊗\vbox{\hrule width 58.5pt}\cr
\\⊗\9\99\98\97\96\95\94⊗\sl Answer: $(987654)↓{10}$.\cr}}\hfill
\vbox{\baselineskip0pt\lineskip0pt
\def\dot{\hbox to 4.5pt{\hfill.\hfill}}
\def\\{\lower 2.25pt\vbox to 11pt{}}
\def\¬{$+$ \\}
\def\bar{\lower 1pt\vbox to 2.4pt{}}
\halign{\hfill#⊗#\hfill⊗\qquad#\hfill\cr
\\⊗\9\99\dot8\97\96\95\94\cr
\¬⊗\9\9\9\99\cr
\bar⊗\vbox{\hrule width 22.5pt}\cr
\\⊗1\91\98\dot7\96\95\94\cr
\¬⊗\9\91\91\98\cr
\bar⊗\vbox{\hrule width 31.5pt}\cr
\\⊗1\93\91\96\dot6\95\94\cr
\¬⊗\9\91\93\91\96\cr
\bar⊗\vbox{\hrule width 40.5pt}\cr
\\⊗1\94\94\98\93\dot5\94\cr
\¬⊗\9\91\94\94\98\93\cr
\bar⊗\vbox{\hrule width 49.5pt}\cr
\\⊗1\96\90\94\92\98\dot4\cr
\¬⊗\9\91\96\90\94\92\98\cr
\bar⊗\vbox{\hrule width 58.5pt}\cr
\\⊗1\97\96\94\97\92\93⊗\sl Answer: $(1764723)↓9$.\cr}}}$$
%folio 768 galley 3a (C) Addison-Wesley 1978	*
\mixans 13. {⊗BUF⊗ALF⊗.\char'40\char'40\char'40\char'40⊗(Radix point on first
line)\cr
⊗⊗ORIG⊗*+39⊗\cr
⊗START⊗JOV⊗OFLO⊗Ensure overflow is off.\cr
⊗⊗ENT2⊗-40⊗Set buffer pointer.\cr
⊗8H⊗ENT3⊗10⊗Set loop counter.\cr
⊗1H⊗ENT1⊗$m$⊗Begin multiplication routine.\cr
⊗⊗ENTX⊗0\cr
⊗2H⊗STX⊗CARRY\cr
⊗⊗$\cdots$⊗⊗(See exercise 4.3.1--13, with\cr
⊗⊗J1P⊗2B⊗\qquad $v = 10↑9$ and $\.W = \.U$.)\cr
⊗⊗SLAX⊗5⊗$\rA ← \null$next nine digits.\cr
⊗⊗CHAR\cr
⊗⊗STA⊗BUF+40,2(2:5)⊗Store next nine digits.\cr
⊗⊗STX⊗BUF+41,2\cr
⊗⊗INC2⊗2⊗Increase buffer pointer.\cr
⊗⊗DEC3⊗1\cr
⊗⊗J3P⊗1B⊗Repeat ten times.\cr
⊗⊗OUT⊗BUF+20,2(PRINTER)\cr
⊗⊗J2N⊗8B⊗Repeat until both lines printed.\quad\blackslug\lower6pt\null\cr}

\ansno 14. Let $K(n)$ be the number of steps
required to convert an $n$-digit decimal number to binary and
at the same time to compute the binary representation of $10↑n$.
Then we have $K(2n) ≤ 2K(n) + O\biglp M(n)\bigrp$.\xskip{\sl Proof:}
Given the number $U = (u↓{2n-1} \ldotsm u↓0)↓{10}$, compute
$U↓1 = (u↓{2n-1} \ldotsm u↓n)↓{10}$ and $U↓0 = (u↓{n-1} \ldotsm
u↓0)↓{10}$ and $10↑n$, in $2K(n)$ steps, then compute $U =
10↑nU↓1 + U↓0$ and $10↑{2n} = 10↑n \cdot 10↑n$ in $O\biglp M(n)\bigrp$
steps. It follows that $K(2↑n) = O\biglp M(2↑n) + 2M(2↑{n-1})
+ 4M(2↑{n-2}) + \cdotss\bigrp = O\biglp nM(2↑n)\bigrp$.
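The recursion in this argument can be sketched as follows (`dec_to_bin` is our name; the built-in conversion is used only for single digits):

```python
def dec_to_bin(s):
    """Convert a decimal digit string to (value, 10^len(s)), splitting the
    string in half as in the bound K(2n) <= 2K(n) + O(M(n))."""
    if len(s) == 1:
        return int(s), 10
    k = len(s) // 2
    hi, p_hi = dec_to_bin(s[:-k])     # U1 together with 10^(n-k)
    lo, p_lo = dec_to_bin(s[-k:])     # U0 together with 10^k
    return hi * p_lo + lo, p_hi * p_lo

assert dec_to_bin("292357") == (292357, 10**6)
```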

$\biglp$Similarly, Sch\"onhage has observed
that we can convert a $(2↑n\lg 10)$-bit number $U$ from binary
to decimal, in $O\biglp nM(2↑n)\bigrp$ steps. First form $V
= 10↑{2↑{n-1}}$ in $O\biglp M(2↑{n-1}) +
M(2↑{n-2}) + \cdotss\bigrp = O\biglp M(2↑n)\bigrp$ steps,
then compute $U↓0 = (U\mod V)$ and $U↓1 = \lfloor U/V\rfloor$
in $O\biglp M(2↑n)\bigrp$ further steps, then convert $U↓0$ and $U↓1$.$\bigrp$

\ansno 18. Let $U =\hbox{round}↓B(u, P)$ and $v =\hbox{round}↓b(U, p)$.
We may assume that $u ≠ 0$, so that $U ≠ 0$ and $v ≠ 0$.\xskip{\sl Case
1}, $v < u$: Determine $e$ and $E$ such that $b↑{e-1} < u
≤ b↑e$, $B↑{E-1} ≤ U < B↑E$. Then $u ≤ U+{1\over2}B↑{E-P}$ and
$U ≤ u - {1\over 2}b↑{e-p}$; hence $B↑{P-1} ≤
B↑{P-E}U < B↑{P-E}u ≤ b↑{p-e}u ≤ b↑p$.\xskip{\sl Case 2}, $v > u$:
Determine $e$ and $E$ such that $b↑{e-1} ≤ u < b↑e$, $B↑{E-1}
< U ≤ B↑E$. Then $u ≥ U - {1\over 2}B↑{E-P}$ and $U ≥ u + {1\over
2}b↑{e-p}$; hence $B↑{P-1} ≤ B↑{P-E}(U - B↑{E-P}) < B↑{P-E}u
≤ b↑{p-e}u < b↑p$. Thus we have proved that $B↑{P-1} < b↑p$
whenever $v ≠ u$.

Conversely, if $B↑{P-1} < b↑p$, the above proof
suggests that the most likely example for which $u ≠ v$ will
occur when $u$ is a power of $b$ and at the same time it is
close to a power of $B$. We have $B↑{P-1}b↑p < B↑{P-1}b↑p +
{1\over 2}b↑p - {1\over 2}B↑{P-1} - {1\over 4} = (B↑{P-1} +
{1\over 2}) \* (b↑p - {1\over 2})$; hence $1 < α = 1/(1
- {1\over 2}b↑{-p}) < 1 + {1\over 2}B↑{1-P} = β$. There are
integers $e$ and $E$ such that $\log↓B α < e\log↓B b - E <
\log↓B β$, since Weyl's theorem (exercise 3.5--22) implies that
there is an integer $e$ with $0 < \log↓B α < (e\log↓B b)\mod
1 < \log↓B β < 1$ when $\log↓B b$ is irrational. Hence $α < b↑e/B↑E
< β$, for some $e$ and $E$.\xskip (Such $e$ and $E$ may also be found
by applying the theory of continued fractions; see Section 4.5.3.)\xskip
Now we have round$↓B(b↑e, P) = B↑E$, and round$↓b(B↑E, p) <
b↑e$.\xskip [{\sl CACM \bf 11} (1968), 47--50; {\sl Proc.\ Amer.\ Math.\ Soc.\
\bf 19} (1968), 716--723.]

\ansno 19. $m↓1 = (\.{F0F0F0F0})↓{16}$, $c↓1 = 1 - 10/16$ makes $U
= \biglp (u↓7u↓6)↓{10} \ldotsm (u↓1u↓0)↓{10}\bigrp ↓{256}$; then $m↓2
= (\.{FF00FF00})↓{16}$, $c↓2 = 1 - 10↑2/16↑2$ makes
 $U = \biglp (u↓7u↓6u↓5u↓4)↓{10}(u↓3u↓2u↓1u↓0)↓{10}\bigrp
↓{65536}$; and $m↓3 = (\.{FFFF0000})↓{16}$, $c↓3 = 1 - 10↑4/16↑4$ finishes the
job.\xskip
(Cf.\ exercise 14. This technique is due to Roy A. Keir, circa 1958.)
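In modern terms the three mask-and-multiply steps can be sketched as follows (the multipliers $6$, $156$, $55536$ are $16-10$, $16↑2-10↑2$, $16↑4-10↑4$, matching the constants $c↓1$, $c↓2$, $c↓3$):

```python
def bcd_to_bin(u):
    """Convert 8 packed BCD digits in a 32-bit word to binary (Keir's trick)."""
    u -= ((u >> 4) & 0x0F0F0F0F) * 6       # each byte now holds a value 0..99
    u -= ((u >> 8) & 0x00FF00FF) * 156     # each halfword now holds 0..9999
    u -= (u >> 16) * 55536                 # the whole word is now binary
    return u

assert bcd_to_bin(0x87654321) == 87654321
```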

%folio 769 galley 3b (C) Addison-Wesley 1978	*
\ansbegin{4.5.1}

\ansno 1. Test whether or not $uv↑\prime
< u↑\prime v$, since the denominators are positive.

\ansno 2. If $c > 1$ divides both $u/d$ and $v/d$, then $cd$
divides both $u$ and $v$.

\ansno 3. Let $p$ be prime. If $p↑e$ is a divisor
of $uv$ and $u↑\prime v↑\prime$ for $e ≥ 1$, then either $p↑e\rslash
u$ and $p↑e\rslash v↑\prime$ or $p↑e\rslash u↑\prime$ and $p↑e\rslash
v$; hence $p↑e\rslash\gcd(u, v↑\prime )\gcd(u↑\prime , v)$.
The converse follows by reversing the argument.

\ansno 4. Let $d↓1 = \gcd(u, v)$, $d↓2 =\gcd(u↑\prime , v↑\prime
)$; the answer is $w = (u/d↓1)(v↑\prime /d↓2)\hbox{sign}(v)$, $w↑\prime
= |(u↑\prime /d↓2)(v/d↓1)|$, with a ``divide by zero'' error
message if $v = 0$.

\ansno 5. $d↓1 = 10$, $t = 17 \cdot 7 - 27 \cdot
12 = -205$, $d↓2 = 5$, $w = -41$, $w↑\prime = 168$.

\ansno 6. Let $u↑{\prime\prime} = u↑\prime /d↓1$, $v↑{\prime\prime}
= v↑\prime /d↓1$; we want to show that $\gcd(uv↑{\prime\prime} + u↑{\prime\prime}
v,d↓1)=\gcd(uv↑{\prime\prime}+u↑{\prime\prime}v,d↓1u↑{\prime\prime}v↑{\prime\prime}
)$. If $p$ is a prime that divides $u↑{\prime\prime} $, then $p$ does
not divide $u$ or $v↑{\prime\prime} $, so $p$ does not divide $uv↑{\prime\prime}
+ u↑{\prime\prime} v$. A similar argument holds for prime divisors of
$v↑{\prime\prime} $, so no prime divisors of $u↑{\prime\prime} v↑{\prime\prime}$
affect the given gcd.
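Answers 5 and 6 together yield the following subtraction routine (a sketch; inputs are assumed to be in lowest terms with positive denominators):

```python
from math import gcd

def frac_sub(u, up, v, vp):
    """(u/up) - (v/vp) in lowest terms, computing only two small gcds."""
    d1 = gcd(up, vp)
    t = u * (vp // d1) - v * (up // d1)
    if t == 0:
        return 0, 1
    d2 = gcd(t, d1)      # by answer 6 this equals gcd(t, d1*(up/d1)*(vp/d1))
    return t // d2, (up // d1) * (vp // d2)

assert frac_sub(17, 120, 27, 70) == (-41, 168)    # the numbers of answer 5
```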

\ansno 7. $(N - 1)↑2 + (N - 2)↑2 = 2N↑2 - (6N - 5)$. If the
inputs are $n$-bit binary numbers, $2n + 1$ bits may be necessary
to represent $t$.

\ansno 8. For multiplication and division these quantities will
obey the rules $x/0 =\hbox{sign}(x)∞$, $(\pm ∞) \times x = x \times
(\pm ∞) = (\pm ∞)/x = \pm\hbox{sign}(x)∞$, $x/(\pm ∞) = 0$, provided
that $x$ is finite and nonzero, without change to the algorithms
described. Furthermore, the algorithms can readily be modified
so that $0/0 = 0 \times (\pm ∞) = (\pm ∞) \times 0 = \hbox{``}(0/0)$'',
where the latter is a representation of ``undefined''; and so
that if either operand is ``undefined'' the result will be ``undefined''
also. Since the multiplication and division subroutines can
yield these fairly natural rules of ``extended arithmetic,''
it is sometimes worth while to modify the addition and subtraction
operations so that they satisfy the rules $x \pm ∞ = \pm ∞$,
$x \pm (-∞) = \mp ∞$, for $x$ finite; $(\pm ∞) + (\pm ∞) = \pm
∞ - (\mp ∞) = \pm ∞$, $(\pm ∞) + (\mp ∞) = (\pm∞) - (\pm ∞) = (0/0)$;
and if either or both operands is $(0/0)$, so is the result. Equality
tests and comparisons may be treated in a similar manner.

The above remarks are independent of ``overflow''
indications. If $∞$ is being used to suggest overflow, it is incorrect
to let $1/∞$ be equal to zero,
lest inaccurate results be regarded as true answers. It is far
better to represent overflow by $(0/0)$, and to adhere to the convention
that the result of any operation is undefined if at least one
of the inputs is undefined. This type of overflow indication
has the advantage that final results of an extended calculation
reveal exactly which answers are defined and which are not.

\ansno 9. If $u/u↑\prime ≠ v/v↑\prime $, then
$$1 ≤ |uv↑\prime - u↑\prime v| = u↑\prime v↑\prime |(u/u↑\prime
) - (v/v↑\prime )| < |2↑{2n}(u/u↑\prime ) - 2↑{2n}(v/v↑\prime)|;$$
two quantities differing by more than unity cannot
have the same ``floor.'' $\biglp$In other words, the first $2n$
bits to the right of the binary point are enough to characterize
the value of the fraction, when there are $n$-bit denominators.
We cannot improve this to $2n - 1$ bits, for if $n = 4$ we have
${1\over 13} = (.00010011\ldotsm)↓2$, ${1\over 14} = (.00010010
\ldotsm)↓2$.$\bigrp$
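The tightness example can be checked mechanically; the following Python sketch (not part of the original answer; the function name is ours) extracts leading bits of a fraction:

```python
def bits(num, den, k):
    # First k binary digits of num/den after the binary point.
    out = []
    for _ in range(k):
        num *= 2
        out.append(num // den)
        num %= den
    return out

# With n = 4, the fractions 1/13 and 1/14 agree in their first
# 2n - 1 = 7 bits but differ in the 8th, so 2n bits are necessary.
```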

\ansno 11. To divide by $(v + v↑\prime \sqrt{5}\,)/v↑{\prime\prime} $,
when $v$ and $v↑\prime$ are not both zero, multiply by the reciprocal,
$(v-v↑\prime\sqrt5\,)v↑{\prime\prime}/(v↑2-5v↑{\prime2})$, and reduce
to lowest terms.
%folio 770 galley 4 (C) Addison-Wesley 1978	*
\ansbegin{4.5.2}

\ansno 1. Substitute min, max, + consistently
for gcd, lcm, $\times$.

\ansno 2. For prime $p$, let $u↓p$, $v↓{1p}$, $\ldotss$, $v↓{np}$
be the exponents of $p$ in the canonical factorizations of $u$,
$v↓1$, $\ldotss$, $v↓n$. By hypothesis, $u↓p ≤ v↓{1p} +\cdots
+ v↓{np}$. We must show that $u↓p ≤ \min(u↓p, v↓{1p})
+\cdots + \min(u↓p, v↓{np})$, and this is certainly
true if $u↓p$ is greater than or equal to each $v↓{jp}$, or
if $u↓p$ is less than some $v↓{jp}$.

\ansno 3. {\sl Solution 1:} A one-to-one correspondence
is obtained if we set $u = \gcd(d, n)$, $v = n↑2/\lcm(d, n)$
for each divisor $d$ of $n↑2$.\xskip {\sl Solution 2:} If $n = p↑{e↓1}↓1
\ldotss p↑{e↓r}↓r$, the number in each case is $(2e↓1 + 1)
\ldotsm (2e↓r + 1)$.

\ansno 4. See exercise 3.2.1.2--15(a).

\ansno 5. Shift $u$ and $v$ right until neither is a
multiple of 3, remembering the proper power of 3 that will appear in the gcd.
Each subsequent iteration sets $t ← u + v$ or $t ← u - v$
(whichever is a multiple of 3), shifts $t$ right until
it is not a multiple of 3, then replaces $\max(u, v)$ by the
result.
$$\vbox{\halign{\hfill#⊗\qquad\hfill#⊗\qquad#\hfill\cr
$u$\hfill⊗$v$\hfill⊗\hfill$t$\qquad\cr
\noalign{\vskip 3pt}
13634⊗24140⊗10506, 3502;\cr
13634⊗3502⊗17136, 5712, 1904;\cr
1904⊗3502⊗5406, 1802;\cr
1904⊗1802⊗102, 34;\cr
34⊗1802⊗1836, 612, 204, 68;\cr
34⊗68⊗102, 34;\cr
34⊗34⊗0.\cr}}$$
The evidence that $\gcd(40902,24140)=34$ is now overwhelming.
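The procedure just traced can be transcribed into Python as follows (a sketch; the name ternary_gcd is ours):

```python
def ternary_gcd(u, v):
    # The "ternary" analog of Algorithm B described above.
    k = 0
    while u % 3 == 0 and v % 3 == 0:     # remember the power of 3 in the gcd
        u //= 3; v //= 3; k += 1
    while u % 3 == 0:
        u //= 3                          # shift u right (base 3)
    while v % 3 == 0:
        v //= 3                          # shift v right (base 3)
    while u != v:
        # Exactly one of u+v, u-v is a multiple of 3 here.
        t = u + v if (u + v) % 3 == 0 else abs(u - v)
        while t % 3 == 0:
            t //= 3                      # shift t right until not a multiple of 3
        if u > v:                        # replace max(u, v) by the result
            u = t
        else:
            v = t
    return 3 ** k * u
```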

\ansno 6. The probability that both $u$ and
$v$ are even is ${1\over 4}$; the probability that both are
multiples of four is ${1\over 16}$; etc. Thus $A$ has the distribution
given by the generating function
$$\textstyle{3\over 4} + {3\over 16}z + {3\over 64}z↑2 +\cdots
 =\dispstyle {3/4\over1 - z/4}.$$
The mean is ${1\over 3}$, and the standard deviation
is $\sqrt{\,{2\over 9} + {1\over 3} - {1\over 9}} = {2\over
3}$. If $u, v$ are independently and uniformly distributed with
$1 ≤ u, v < 2↑N$, then some small correction terms are needed;
the mean is then actually
$$\chop to 11pt{(2↑N - 1)↑{-2} \sum ↓{1≤k≤N}(2↑{N-k} - 1)↑2 = {\textstyle{1\over
3}}-{\textstyle{4\over 3}}(2↑N - 1)↑{-1} + N(2↑N - 1)↑{-2}.}$$

\ansno 7. When $u, v$ are not both even, each of
the cases (even, odd), (odd, even), (odd, odd) is equally probable,
and $B = 1$, 0, 0 in these cases. Hence $B = {1\over 3}$ on
the average. Actually, as in exercise 6, a small correction
could be given to be strictly accurate when $1 ≤ u, v < 2↑N$;
the probability that $B = 1$ is actually
$$\chop to 11pt{(2↑N - 1)↑{-2}\sum ↓{1≤k≤N}(2↑{N-k} - 1)2↑{N-k} = {\textstyle{1\over
3}} - {\textstyle{1\over 3}}(2↑N - 1)↑{-1}.}$$

\ansno 8. $E$ is the number of subtraction cycles
in which $u > v$, plus one if $u$ is odd after step B1. If we
change the inputs from $(u, v)$ to $(v, u)$, the value of $C$
stays unchanged, while $E$ becomes $C - E$ or $C - E - 1$; the
latter case occurs iff $u$ and $v$ are both odd after step B1,
and this has probability ${1\over 3} + {2\over 3}/(2↑N -
1)$. Hence \def\\{↓{\hbox{\:e ave}}}
$$\textstyle E\\ = C\\ - E\\ - {1\over 3} - {2\over
3}/(2↑N - 1).$$

\ansno 9. The binary algorithm first gets to B6
with $u = 1963$, $v = 1359$; then $t ← 604$, 302, 151, etc. The
gcd is 302. Using Algorithm X we find that $2 \cdot 31408 - 23
\cdot 2718 = 302$.
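The coefficients come from a direct transcription of Algorithm X (a Python sketch, with the vector notation of Section 4.5.2):

```python
def ext_gcd(u, v):
    # Algorithm X: returns (u1, u2, u3) with u1*u + u2*v == u3 == gcd(u, v).
    u1, u2, u3 = 1, 0, u
    v1, v2, v3 = 0, 1, v
    while v3 != 0:
        q = u3 // v3
        # (u1,u2,u3) <- (v1,v2,v3), (v1,v2,v3) <- (u1,u2,u3) - q(v1,v2,v3)
        u1, u2, u3, v1, v2, v3 = v1, v2, v3, u1 - q * v1, u2 - q * v2, u3 - q * v3
    return u1, u2, u3
```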

\ansno 10. (a)\9 Two integers are relatively prime iff they are
not both divisible by any prime number.\xskip (b) Rearrangement of
the sum in (a), in terms of the denominators $k = p↓1 \ldotsm
p↓r$.\xskip $\biglp$Note
that each of the sums in (a) and (b) is actually finite.$\bigrp$\xskip
(c) $(n/k)↑2 - \lfloor n/k\rfloor ↑2 = O(n/k)$, so $q↓n - \sum
↓{1≤k≤n\lower2pt\null} \mu (k)(n/k)↑2 = \sum ↓{1≤k≤n} O(n/k) = O(nH↓n)$.\xskip (d)
$\sum ↓{d\rslash n} \mu (d) = \delta ↓{1n}$.\xskip [In fact, we have
the more general result
$$\sum ↓{d\rslash n} \mu (d)\left(n\over d\right)↑s = n↑s - \sum
\left(n\over p\right)↑s +\cdotss,$$
as in part (b), where the sums are over the prime
divisors of $n$, and this is equal to $n↑s(1 - 1/p↑{s}↓{1}) \ldotsm
(1 - 1/p↑{s}↓{r})$ if $n = p↑{e↓1}↓{1} \ldotss p↑{e↓r}↓{r}$.]

{\sl Notes:} Similarly, we find that a set of $k$ integers is relatively prime with
probability $1\hbox{\:a/}\biglp\sum↓{n≥1}1/n↑k\bigrp$. This proof
of Theorem D is due to F. Mertens,
{\sl J. f\"ur die reine und angew.\ Math.\ \bf 77}
(1874), 289--291. The technique actually gives a much stronger result,
namely that $6π↑{-2}n↑2 + O(n\log n)$ pairs of integers $u\in\hbox{\:a[}
f(n), f(n) + n\bigrp$, $v \in \hbox{\:a[} g(n), g(n) +
n\bigrp$ are relatively prime, for arbitrary $f$
and $g$.

\ansno 11. (a)\9 $6/π↑2$ times $1 + {1\over 4} + {1\over 9}$,
namely $49/(6π↑2) \approx .82746$.\xskip (b) $6/π↑2$ times $1/1 + 2/4
+ 3/9 +\cdotss$, namely $∞$.\xskip $\biglp$This is true in spite of the
result of exercise 12, and in spite of the fact that the average
value of $\ln\gcd(u, v)$ is a small, finite number.$\bigrp$

\ansno 12. Let $\sigma (n)$ be the number
of positive divisors of $n$. The answer is
$$\sum ↓{k≥1} \sigma (k) \cdot {6\over π↑2k↑2} = {6\over π↑2}
\bigglp\sum ↓{k≥1} {1\over k↑2}\biggrp
↑2 = {π↑2\over 6} .$$
[Thus, the average is {\sl less} than 2, although
there are always at least two common divisors when $u$ and $v$ are
not relatively prime.]

%folio 771 galley 5 (C) Addison-Wesley 1978	*
\ansno 13. $1 + {1\over 9} + {1\over 25} +\cdots
= 1 + {1\over 4} + {1\over 9} +\cdots - {1\over
4}(1 + {1\over 4} + {1\over 9} +\cdotss)$.

\ansno 14. $v↓1 = \pm v/u↓3$, $v↓2 =\mp u/u↓3$ (the sign depends
on whether the number of iterations is even or odd). This follows
from the fact that $v↓1$ and $v↓2$ are relatively prime to each
other (throughout the algorithm), and that $v↓1u = -v↓2v$.\xskip [Hence
$v↓1u =\lcm(u, v)$ at the close of the algorithm, but this
is not an especially efficient way to compute the least common
multiple. For a generalization, see exercise 4.6.1--18.]

G. E. Collins has observed that $|u↓1| ≤ {1\over
2}v/u↓3$, $|u↓2| ≤ {1\over 2}u/u↓3$, at the termination of Algorithm
X\null, except in certain trivial cases, since the final value of
$q$ is usually $≥2$. This bounds the size of $|u↓1|$ and $|u↓2|$ throughout
the execution of the algorithm.

\ansno 15. Apply Algorithm X to $v$ and $m$, thus obtaining
a value $x$ such that $xv ≡ 1\modulo m$.\xskip (This can be done
by simplifying Algorithm X so that $u↓2$, $v↓2$, and $t↓2$ are not computed,
since they are never used in the answer.)\xskip Then set $w ← ux
\mod m$.\xskip [It follows, as in exercise 30, that this process requires
$O(n↑2)$ units of time, when it is applied to large $n$-bit
numbers. An alternative to Algorithm X appears in exercise 35.]
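A Python sketch of this process (names ours; only the first components of Algorithm X's vectors are carried, as the parenthetical suggests):

```python
def solve_congruence(u, v, m):
    # Find w with v*w == u (mod m), assuming gcd(v, m) = 1:
    # first x with x*v == 1 (mod m), then w = u*x mod m.
    # Invariants: x*v == a (mod m) and x1*v == b (mod m).
    x, x1, a, b = 1, 0, v, m
    while b != 0:
        q = a // b
        x, x1 = x1, x - q * x1
        a, b = b, a - q * b
    assert a == 1            # gcd(v, m) must be 1
    return u * (x % m) % m
```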

\ansno 16. (a)\9 Set $t↓1 = x + 2y + 3z$; then $3t↓1 + y + 2z
= 1$, $5t↓1 - 3y - 20z = 3$. Eliminate $y$, then $14t↓1 - 14z
= 6$: No solution.\xskip (b) This time $14t↓1 - 14z = 0$. Divide by
14, eliminate $t↓1$; the general solution is $x = 8z - 2$, $y
= 1 - 5z$, $z$ arbitrary.

\ansno 17. Let $u↓1$, $u↓2$, $u↓3$, $v↓1$, $v↓2$, $v↓3$ be multiprecision
variables, in addition to $u$ and $v$. The extended algorithm will
act the same on $u↓3$ and $v↓3$ as Algorithm L does on $u$ and
$v$. New multiprecision operations are to set $t ← Au↓j$, $t ←
t + Bv↓j$, $w ← Cu↓j$, $w ← w + Dv↓j$, $u↓j ← t$, $v↓j ← w$ for all
$j$, in step L4; also if $B = 0$ in that step to set $t ← u↓j
- qv↓j$, $u↓j ← v↓j$, $v↓j ← t$ for all $j$ and for $q = \lfloor
u↓3/v↓3\rfloor $. A similar modification is made to step L1
if $v↓3$ is small. The inner loop (steps L2 and L3) is unchanged.

\ansno 18. If $mn = 0$, the probabilities of the lattice-point
model in the test are exact, so we may assume that $m ≥ n >
0$. {\sl Valida vi}, the following values have been obtained:

\yskip{\sl Case 1}, $m = n$.\xskip From $(n, n)$ we go to $(n
- t, n)$ with probability $t/2↑t - 5/2↑{t+1} + 3/2↑{2t}$, for
$2 ≤ t < n$.\xskip (These values are ${1\over 16}$, ${7\over 64}$,
${27\over 256}$, $\ldotss\,$.)\xskip To $(0, n)$ the probability is $n/2↑{n-1}
- 1/2↑n + 1/2↑{2n-2}$. To $(n, k)$ the probability is the same
as to $(k, n)$. The algorithm terminates with probability $1/2↑{n-1}$.

\yskip{\sl Case 2}, $m = n + 1$.\xskip From $(n + 1,
n)$ we get to $(n, n)$ with probability ${1\over 8}$ when $n
> 1$, or 0 when $n = 1$; to $(n - t, n)$ with probability $11/2↑{t+3}
- 3/2↑{2t+1}$, for $1 ≤ t < n - 1$.\xskip (These values are ${5\over
16}$, ${1\over 4}$, $\ldotss\,$.)\xskip We get to $(1, n)$ with probability
$5/2↑{n+1} - 3/2↑{2n-1}$, for $n > 1$; to $(0, n)$ with probability
$3/2↑n - 1/2↑{2n-1}$.

\yskip{\sl Case 3}, $m ≥ n + 2$.\xskip The probabilities are
given by the following table:
$$\vbox{\baselineskip13pt
\halign{$#:\hfill$\qquad⊗$#\hfill$\cr
(m - 1, n)⊗1/2 - 3/2↑{m-n+2} - \delta ↓{n1}/2↑{m+1};\cr
(m - t, n)⊗1/2↑t + 3/2↑{m-n+t+1},\qquad 1 < t < n;\cr
(m - n, n)⊗1/2↑n + 1/2↑m,\qquad n > 1;\cr
(m - n - 1, n)⊗1/2↑{n+1} + 1/2↑{m-1};\cr
(m - n - t, n)⊗1/2↑{n+t},\qquad 1 < t < m - n;\cr
(0, n)⊗1/2↑{m-1}.\cr}}$$

\yskip[{\sl Note:} Although these exact probabilities
will certainly improve on the lattice-point model considered
in the text, they lead to recurrence relations of much greater
complexity; and they will not provide the true behavior of Algorithm
B\null, since for example the probability that $\gcd(u, v) = 5$ is
different from the probability that $\gcd(u, v) = 7$.]

\ansno 19. $A↓{n+1} = a + \sum ↓{1≤k≤n} 2↑{-k}A↓{(n+1)(n-k)}
+ 2↑{-n}b = a + \sum ↓{1≤k≤n} 2↑{-k}A↓{n(n-k)} + {1\over 2}c{(1
- 2↑{-n})} + 2↑{-n}b = a + {1\over 2}A↓{n(n-1)} + {1\over 2}(A↓n - a) + {1\over
2}c(1 - 2↑{-n})$;
now substitute for $A↓{n(n-1)}$ from (36).

\ansno 20. The paths described in the hint have the same probability,
but the subsequent termination of the algorithm has a different
probability; thus $λ = k + 1$ with probability $2↑{-k}$ times
the probability that $λ = 1$. Let the latter probability be
$p$. We know from the text that $λ = 0$ with approximate probability
${3\over 5}$; hence ${2\over 5} = p(1 + {1\over 2} + {1\over
4} + \cdotss) = 2p$. The average is $p(1
+ {2\over 2} + {3\over 4} + {4\over 8} +\cdotss)
= p(1 + {1\over 2} + {1\over 4} + {1\over 8} +\cdotss
)↑2 = 4p$.\xskip [The exact probability that $λ = 1$ is ${1\over 5}
- {6\over 5}(-{1\over 4})↑n$ if $m > n ≥ 1$, ${1\over 5}
- {16\over 5}(-{1\over 4})↑n$ if $m = n ≥ 2$.]

\ansno 21. Show that for fixed $v$ and for $2↑m < u < 2↑{m+1}$,
when $m$ is large, each subtraction-shift cycle of the algorithm
reduces $\lfloor\lg u\rfloor$ by two, on the average.

\ansno 22. Exactly $(N - m)2↑{m-1+\delta↓{m0}}$ integers
$u$ in the range $1 ≤ u < 2↑N$ have $\lfloor\lg u\rfloor = m$,
after $u$ has been shifted right until it is odd.

\ansno 23. The first sum is $2↑{2N-2} \sum ↓{0≤m<n<N} mn2↑{-m-n}\biglp
(α + β)N + \gamma - αm - βn\bigrp $. Since $\sum ↓{0≤m<n} m2↑{-m}=2
- (n+ 1)2↑{1-n}$ and $\sum ↓{0≤m<n} m(m - 1)2↑{-m} = 4 - (n↑2
+ n + 2)2↑{1-n}$, the sum on $m$ is $2↑{2N-2} \sum ↓{0≤n<N}
n2↑{-n}\biglp (\gamma - α - βn + (α + β)N)(2 - (n + 1)2↑{1-n}) - α(4
- (n↑2 + n + 2)2↑{1-n})\bigrp = 2↑{2N-2}\biglp (α + β)N
\sum ↓{0≤n<N} n2↑{-n}(2 - (n + 1)2↑{1-n}) + O(1)\bigrp $. Thus
the coefficient of $(α + β)N$ in the answer is found to be $2↑{-2}\biglp
4 - ({4\over 3})↑3\bigrp = {11\over 27}$. A similar argument
applies to the other sum.

[{\sl Note:} The {\sl exact} value of the
sums may be obtained after some tedious calculation by means
of the general formula
$$\sum ↓{0≤k<n}k↑{\underline m}\,z↑k = {m!\,z↑m\over (1 - z)↑{m+1}} - \sum ↓{0≤k≤m}
{m↑{\underline k}\,n↑{\underline{m-k}}\,z↑{n+k}\over (1 - z)↑{k+1}},$$
which follows from summation by parts.]

\ansno 24. Solving a recurrence similar to (34), we find
that the number of times is $A↓{mn}$, where $A↓{00} = 1$, $A↓{0n}
= (n + 3)/2$, $A↓{nn} = {8\over 5} - (3n + 13)/(9 \cdot 2↑n) +
{128\over 45}(-{1\over 4})↑n$ if $n ≥ 1$, $A↓{mn} = {8\over5}-2/(3\cdot2↑n)+
{16\over 15}(-{1\over4})↑n$ if $m > n ≥ 1$. Since the condition $u = 1$ or $v = 1$
is therefore satisfied only about 1.6 times in an average run,
it is not worth making the suggested test each time step B5
is performed.\xskip (Of course the lattice model is not completely
accurate, but it seems reasonable to believe that it is not
too inaccurate for this application.)

\ansno 25. (a)\9 $F↓{n+1}(x)=\sum↓{d≥1}2↑{-d}\,\hbox{probability that}\biglp X↓n<1$
and $2↑d/(X↑{-1}↓{n} - 1) < x$ or $X↓n > 1$ and $(X↓n - 1)/2↑d
< x\bigrp = \sum ↓{d≥1} 2↑{-d}\biglp F↓n(1/(1 + 2↑dx↑{-1}))
+ F↓n(1 + 2↑dx) - F↓n(1)\bigrp $.\xskip (b) $G↓{n+1}(x) = 1 - \sum
↓{d≥1}2↑{-d}\biglp G↓n(1/(1 + 2↑dx)) - G↓n(1/(1 + 2↑dx↑{-1}))\bigrp
$.\xskip (c) $H↓n(x)=\sum↓{d≥1}2↑{-d}\,\hbox{probability that}\biglp Y↓n ≤ x$ and
$(1 - Y↓n)/2↑d ≤ x\bigrp = \sum ↓{d≥1} 2↑{-d}\max\biglp 0,\penalty0{G↓n(x)
- G↓n(1 - 2↑dx)}\bigrp $.

Starting with $G↓0(x) = x$ we get rapid convergence
to a limiting distribution where $$\biglp G(.1), \ldotss , G(.9)\bigrp
= (.2750, .4346, .5544, .6507, .7310, .7995, .8590, .9114, .9581).$$
The expected value of $\ln\biglp\max(u↓n,v↓n)/\,\max(u↓{n+1},
v↓{n+1})\bigrp$ is $\int ↑{1}↓{0} H↓n(t)\,dt/t$, and Brent has shown
that this can be written
$$\int ↑{1}↓{1/3} {G↓n(t)\over t}\,dt - \int ↑{1/3}↓{0}
{G↓n(t)\over 1 - t}\,dt + \sum ↓{k≥1} 2↑{-k}\int↑{1/(1+2↑k)}↓{1/(1+2↑{k+1})}
{G↓n(t)\over t(1 - t)}\,dt.$$

\ansno 26. By induction, the length is
$m + \lfloor n/2\rfloor$ when $m ≥ n$, except that when $m =
n = 1$ there is {\sl no} path to $(0, 0)$.
%folio 776 galley 6a (C) Addison-Wesley 1978	*
\ansno 27. Let $a↓n = \biglp 2↑n - (-1)↑n\bigrp /3$; then $a↓0$,
$a↓1$, $a↓2$, $\ldots = 0$, 1, 1, 3, 5, 11, 21, $\ldotss\,$.\xskip (This sequence
of numbers has an interesting pattern of zeros and ones in its
binary representation. Note that $a↓n = a↓{n-1} + 2a↓{n-2}$,
and $a↓n + a↓{n+1} = 2↑n$.)\xskip For $m > n$, let $u = 2↑{m+1} -
a↓{n+2}$, $v = a↓{n+2}$. For $m = n > 0$, let $u = a↓{n+2}$, $v
= 2a↓{n+1}$, or $u = 2a↓{n+1}$, $v = a↓{n+2}$ (depending on which
is larger). Another example for $m = n > 0$ is $u = 2↑{n+1}
- 1$, $v = 2↑{n+1} - 2$; this takes more shifts, and gives $C
= n + 1$, $D = 2n$, $E = 1$.

\ansno 28. This is a problem where it appears to be necessary
to prove {\sl more} than was asked just to prove what was asked.
Let us prove the following: {\sl If $u$ and $v$ are positive integers,
Algorithm B does $≤1 + \lfloor\lg\max(u, v)\rfloor$ subtraction
steps; and if equality holds, then $\lfloor\lg (u
+ v)\rfloor > \lfloor\lg\max(u, v)\rfloor $.}

For convenience, let us assume that $u ≥ v$; let
$m = \lfloor\lg u\rfloor$, $n = \lfloor\lg v\rfloor$;
and let us use the ``lattice-point'' terminology, saying that
we are ``at point $(m, n)$.'' The proof is by induction on $m
+ n$.

\yskip{\sl Case 1}, $m = n$.\xskip Clearly, $\lfloor\lg(u
+ v)\rfloor > \lfloor\lg u\rfloor$ in this case. If $u =
v$ the result is trivial; otherwise the next subtraction-shift
cycle takes us to a point $(m - k, m)$. By induction, at most
$m + 1$ further subtraction steps will be required; but if $m
+ 1$ more {\sl are} needed, we have $\lfloor\lg\biglp (u -
v)2↑{-r} + v\bigrp \rfloor > \lfloor\lg v\rfloor$, where
$r ≥ 1$ is the number of right shifts that were made. This is
impossible, since $(u - v)2↑{-r} + v < (u - v) + v = u$. So
at most $m$ further steps are needed.

\yskip{\sl Case 2}, $m > n$.\xskip The next subtraction
step takes us to $(m - k, n)$, and at most $1 + \max(m - k, n)
≤ m$ further steps will be required. Now if $m$ further steps
{\sl are} required, then $u$ has been replaced by $u↑\prime
= (u - v)2↑{-r}$ for some $r ≥ 1$. By induction, $\lfloor\lg(u↑\prime
+ v)\rfloor ≥ m$; hence
$$\lfloor\lg(u + v)\rfloor = \lfloor\lg 2\biglp (u -
v)/2 + v\bigrp \rfloor ≥ \lfloor\lg 2(u↑\prime + v)\rfloor
≥ m + 1 > \lfloor\lg u\rfloor.$$

\ansno 29. Subtract the $k$th column from the $2k$th,
$3k$th, $4k$th, etc., for $k = 1$, 2, 3, $\ldotss\,$. The result
is a triangular matrix with $x↓k$ on the diagonal in column
$k$, where $m = \sum ↓{d\rslash m} x↓d$. It follows that $x↓m
= \varphi (m)$, so the determinant is $\varphi (1)\varphi (2)
\ldotsm \varphi (n)$.\xskip [In general, ``Smith's determinant,'' in
which the $(i, j)$ element is $f\biglp\gcd(i, j)\bigrp$ for
an arbitrary function $f$, is equal to $\prod↓{1≤m≤n} \sum ↓{d\rslash
m} \mu (m/d)f(d)$, by the same argument. See L. E. Dickson,
{\sl History of the Theory of Numbers \bf 1} (New York: Chelsea,
1952), 122--123.]
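Smith's determinant is easy to verify numerically; the following Python sketch (exact rational elimination; all scaffolding ours) checks it for $n = 6$:

```python
from fractions import Fraction
from math import gcd

def phi(m):
    # Euler's totient, straight from the definition.
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

def det(M):
    # Exact determinant via Gaussian elimination over the rationals.
    M = [[Fraction(x) for x in row] for row in M]
    n, sign, prod = len(M), 1, Fraction(1)
    for i in range(n):
        p = next((r for r in range(i, n) if M[r][i] != 0), None)
        if p is None:
            return Fraction(0)
        if p != i:
            M[i], M[p] = M[p], M[i]
            sign = -sign
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            M[r] = [a - f * b for a, b in zip(M[r], M[i])]
        prod *= M[i][i]
    return sign * prod

n = 6
smith = [[gcd(i, j) for j in range(1, n + 1)] for i in range(1, n + 1)]
expected = 1
for m in range(1, n + 1):
    expected *= phi(m)       # phi(1)*phi(2)*...*phi(n)
```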

\ansno 30. To determine $A$ and $r$ such that $u = Av + r$, $0
≤ r < v$, using ordinary long division, takes $O\biglp (1 +
\log A)(\log u)\bigrp$ units of time. If the quotients during
the algorithm are $A↓1$, $A↓2$, $\ldotss$, $A↓m$, then $A↓1A↓2 \ldotsm
A↓m ≤ u$, so $\log A↓1 +\cdots + \log A↓m ≤ \log
u$. Also $m = O(\log u)$.

\ansno 31. In general, since $(a↑u - 1)\mod (a↑v - 1) = a↑{u\mod v}
- 1$ (cf.\ Eq.\ 4.3.2--19), we find that $\gcd(a↑m - 1, a↑n - 1)
= a↑{\gcd(m,n)} - 1$ for all positive integers $a$.
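A brute-force check of the identity in Python:

```python
from math import gcd

# gcd(a^m - 1, a^n - 1) = a^gcd(m,n) - 1 for several bases and exponents.
for a in (2, 3, 10):
    for m in range(1, 8):
        for n in range(1, 8):
            assert gcd(a**m - 1, a**n - 1) == a**gcd(m, n) - 1
```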

\ansno 32. Yes, to $O\biglp n(\log n)↑2(\log
\log n)\bigrp$, even if we also need to compute the sequence of
partial quotients that would be computed by Euclid's algorithm;
see A. Sch\"onhage, {\sl Acta Informatica
\bf 1} (1971), 139--144.\xskip [But Algorithm L is better in practice
unless $n$ is extremely large.]

\ansno 34. Keep track of the most significant and least significant
words of the operands (the most significant is used to guess
the sign of $t$ and the least significant is used to determine the
amount of right shift), while building a $2 \times 2$ matrix $A$ of
single-precision integers such that $A{u\choose v}={u↑\prime w\choose v↑\prime w}$,
where $w$ is the computer word size and
where $u↑\prime$ and $v↑\prime$ are smaller
than $u$ and $v$.\xskip (Instead of dividing the simulated odd operand
by 2, multiply the other one by 2, until obtaining multiples
of $w$ after exactly $\lg w$ shifts.)\xskip Experiments
show this algorithm running four times as fast as Algorithm
L\null, on at least one computer.

\ansno 35. (Solution by Michael Penk.)

\algstep Y1. [Find power of 2.] Same as step B1.

\algstep Y2. [Initialize.] Set $(u↓1,u↓2,u↓3)←(1,0,u)$ and
$(v↓1,v↓2,v↓3)←(v,1-u,v)$. If $u$ is odd, set $(t↓1,t↓2,t↓3)←(0,-1,-v)$ and
go to Y4. Otherwise set $(t↓1,t↓2,t↓3)←(1,0,u)$.

\algstep Y3. [Halve $t↓3$.] If $t↓1$ and $t↓2$ are both even, set
$(t↓1,t↓2,t↓3)←(t↓1,t↓2,t↓3)/2$; otherwise set $(t↓1,t↓2,t↓3)←(t↓1+v,t↓2-u,t↓3)/2
$.\xskip(In the latter case, $t↓1+v$ and $t↓2-u$ will both be even.)

\algstep Y4. [Is $t↓3$ even?] If $t↓3$ is even, go back to Y3.

\algstep Y5. [Reset $\max(u↓3,v↓3)$.] If $t↓3>0$, set $(u↓1,u↓2,u↓3)←(t↓1,t↓2,t↓3)$;
otherwise set $(v↓1,v↓2,v↓3)←(v-t↓1,-u-t↓2,-t↓3)$.

\algstep Y6. [Subtract.] Set $(t↓1,t↓2,t↓3)←(u↓1,u↓2,u↓3)-(v↓1,v↓2,v↓3)$. Then
if $t↓1<0$, set $(t↓1,t↓2)←(t↓1+v,t↓2-u)$. If $t↓3≠0$, go back to Y3. Otherwise
the algorithm terminates with $(u↓1,u↓2,u↓3\cdot2↑k)$ as the output.\quad\blackslug

\yyskip It is clear that the relations in (16) are preserved, and that $0≤u↓1,v↓1,
t↓1≤v$, $0≥u↓2,v↓2,t↓2≥-u$ after each of steps Y2--Y6. If $u$ is odd after step Y2,
then step Y3 can be simplified, since $t↓1$ and $t↓2$ are both even iff $t↓2$ is
even; similarly, if $v$ is odd, then $t↓1$ and $t↓2$ are both even iff $t↓1$ is
even.  Thus, as in Algorithm X, it is possible to suppress all calculations
involving $u↓2$, $v↓2$, and $t↓2$, provided that $v$ is odd after step Y2. This
condition is often known in advance (e.g., when $v$ is prime and we are trying to
compute $u↑{-1}$ modulo $v$).
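Algorithm Y transcribes directly into Python (a sketch; multiprecision arithmetic comes free here):

```python
def binary_ext_gcd(u, v):
    # Algorithm Y: returns (u1, u2, g) with u1*u + u2*v = g = gcd(u, v),
    # for positive integers u and v.
    k = 0                                        # Y1: remove common powers of 2
    while u % 2 == 0 and v % 2 == 0:
        u //= 2; v //= 2; k += 1
    u1, u2, u3 = 1, 0, u                         # Y2: initialize
    v1, v2, v3 = v, 1 - u, v
    t1, t2, t3 = (0, -1, -v) if u % 2 == 1 else (1, 0, u)
    while True:
        while t3 % 2 == 0 and t3 != 0:           # Y3/Y4: halve t3 while it is even
            if t1 % 2 == 0 and t2 % 2 == 0:
                t1, t2, t3 = t1 // 2, t2 // 2, t3 // 2
            else:                                # t1+v and t2-u are both even here
                t1, t2, t3 = (t1 + v) // 2, (t2 - u) // 2, t3 // 2
        if t3 > 0:                               # Y5: reset max(u3, v3)
            u1, u2, u3 = t1, t2, t3
        else:
            v1, v2, v3 = v - t1, -u - t2, -t3
        t1, t2, t3 = u1 - v1, u2 - v2, u3 - v3   # Y6: subtract
        if t1 < 0:
            t1, t2 = t1 + v, t2 - u
        if t3 == 0:
            return u1, u2, u3 * 2 ** k
```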
%folio 777 galley 6b (C) Addison-Wesley 1978	*
\def\bslash{\char'477 } \def\vbslash{\char'477017 } % boldface slashes (vol. 2 only)
\ansbegin{4.5.3}

\ansno 1. The running time is about
$19.02T + 6$, just a trifle slower than Program 4.5.2A.

\ansno 2. $\dispstyle\left({Q↓n(x↓1,x↓2,\ldotss ,x↓{n-1},x↓n)\atop Q↓{n-1}(x↓2,
\ldotss ,x↓{n-1},x↓n)}\9 {Q↓{n-1}(x↓1,x↓2, \ldotss , x↓{n-1})\atop Q↓{n-2}(x↓2,
\ldotss , x↓{n-1})}\right)$.

\ansno 3. $Q↓n(x↓1, \ldotss , x↓n)$.

\ansno 4. By induction, or by taking the determinant
of the matrix product in exercise 2.
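Continuants are cheap to compute from the recurrence $Q↓n(x↓1,\ldotss,x↓n)=x↓nQ↓{n-1}(x↓1,\ldotss,x↓{n-1})+Q↓{n-2}(x↓1,\ldotss,x↓{n-2})$; the following Python sketch also checks the determinant identity of exercise 4 on an example:

```python
def Q(xs):
    # Continuant: Q() = 1, Q(x1) = x1, Q(x1..xn) = xn*Q(x1..x_{n-1}) + Q(x1..x_{n-2}).
    a, b = 1, 0
    for x in xs:
        a, b = x * a + b, a
    return a

# Determinant of the matrix product in exercise 2 is (-1)^n:
xs = [1, 2, 3, 4, 5]
d = Q(xs) * Q(xs[1:-1]) - Q(xs[:-1]) * Q(xs[1:])
```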

\ansno 5. When the $x$'s are positive, the $q$'s
of (9) are positive, and $q↓{n+1} > q↓{n-1}$; hence (9) is an
alternating series of decreasing terms, and it converges iff
$q↓nq↓{n+1} → ∞$. By induction, if the $x$'s are greater
than $ε$, we have $q↓n ≥ c(1 + ε/2)↑n$, where $c$ is chosen small enough to make
this inequality valid for $n=1$ and 2. But if $x↓n = 1/2↑n$ then $q↓n ≤ 2
- 1/2↑n$.

\ansno 6. It suffices to prove that $A↓1 = B↓1$;
and from the fact that $0 ≤ \bslash x↓1, \ldotss , x↓n\bslash
< 1$ whenever $x↓1, \ldotss , x↓n$ are positive integers, we
have $B↓1 = \lfloor 1/X\rfloor = A↓1$.

\ansno 7. Only $1\,2 \ldotsm n$ and $n \ldotsm 2\,1$.\xskip
(The variable $x↓k$ appears in exactly $F↓k\,F↓{n+1-k}$ terms; hence
$x↓1$ and $x↓n$ can only be permuted into $x↓1$ and $x↓n$. If
$x↓1$ and $x↓n$ are fixed by the permutation, it follows by
induction that $x↓2$, $\ldotss$, $x↓{n-1}$ are also fixed.)

\ansno 8. This is equivalent to
$${Q↓{n-2}(A↓{n-1}, \ldotss , A↓2) - XQ↓{n-1}(A↓{n-1}, \ldotss
, A↓1)\over Q↓{n-1}(A↓n, \ldotss , A↓2) - XQ↓n(A↓n, \ldotss ,
A↓1)} = -{1\over X↓n} ,$$
and by (6) this is equivalent to
$$X = {Q↓{n-1}(A↓2, \ldotss , A↓n) + X↓nQ↓{n-2}(A↓2, \ldotss
, A↓{n-1})\over Q↓n(A↓1, \ldotss , A↓n) + X↓nQ↓{n-1}(A↓1, \ldotss
, A↓{n-1})} .$$

\ansno 9. (a)\9 By definition.\xskip (b), (d) Prove this
when $n = 1$, then apply (a) to get the result for general $n$.\xskip
(c) Prove when $n = k + 1$, then apply (a).

\ansno 10. If $A↓0 > 0$, then $B↓0 = 0$, $B↓1 = A↓0$, $B↓2 = A↓1$,
$B↓3 = A↓2$, $B↓4 = A↓3$, $B↓5 = A↓4$, $m = 5$. If $A↓0 = 0$, then
$B↓0 = A↓1$, $B↓1 = A↓2$, $B↓2 = A↓3$, $B↓3 = A↓4$, $m = 3$. If $A↓0
= -1$ and $A↓1 = 1$, then $B↓0 = -(A↓2 + 2)$, $B↓1 = 1$, $B↓2 =
A↓3 - 1$, $B↓3 = A↓4$, $m = 3$. If $A↓0 = -1$ and $A↓1 > 1$,
then $B↓0 = -2$, $B↓1 = 1$, $B↓2 = A↓1 - 2$, $B↓3 = A↓2$, $B↓4 = A↓3$, $B↓5 = A↓4$,
$m = 5$. If $A↓0 < -1$, then $B↓0 = -1$, $B↓1 = 1$, $B↓2 = -A↓0 -
2$, $B↓3 = 1$, $B↓4 = A↓1 - 1$, $B↓5 = A↓2$, $B↓6 = A↓3$, $B↓7 = A↓4$.\xskip
[Actually, the last three cases involve eight subcases; if any
of the $B$'s is set to zero, the values should be ``collapsed
together'' by using the rule of exercise 9(c). For example,
if $A↓0 = -1$, $A↓1 = A↓3 = 1$, we actually have $B↓0 = -(A↓2
+ 2)$, $B↓1 = A↓4 + 1$, $m = 1$. Double collapsing occurs when $A↓0
= -2$, $A↓1 = 1$.]
%folio 777 galley 7 Bad beginning. (C) Addison-Wesley 1978	*
\def\bslash{\char'477 } \def\vbslash{\char'477017 } % boldface slashes (vol. 2 only)
\ansno 11. Let $q↓n = Q↓n(A↓1, \ldotss , A↓n)$, $q↑\prime↓{n}
= Q↓n(B↓1, \ldotss , B↓n)$, $p↓n = Q↓{n+1}(A↓0, \ldotss , A↓n)$,
and $p↑\prime↓{n} = Q↓{n+1}(B↓0, \ldotss , B↓n)$. We have $X =
(p↓m + p↓{m-1}X↓m)/(q↓m + q↓{m-1}X↓m)$, $Y = (p↑\prime↓{n}
+ p↑\prime↓{n-1}Y↓n)/\penalty0(q↑\prime↓{n} + q↑\prime↓{n-1}Y↓n)$;
therefore if $X↓m = Y↓n$, the stated relation between $X$ and
$Y$ holds by (8). Conversely, if $X = (qY + r)/(sY + t)$, $|qt
- rs| = 1$, we may assume that $s ≥ 0$, and we can show that
the partial quotients of $X$ and $Y$ eventually agree, by induction
on $s$. The result is clear when $s = 0$, by exercise 9(d).
If $s > 0$, let $q = as + s↑\prime$, where $0 ≤ s↑\prime < s$. Then
$X = a + 1/\biglp (sY + t)/(s↑\prime Y + r - at)\bigrp $; since
$s(r - at) - ts↑\prime = sr - tq$, and $s↑\prime < s$, we know
by induction and exercise 10 that the partial quotients of $X$
and $Y$ eventually agree.\xskip [{\sl Note:} The fact that $m$
is always odd in exercise 10 shows, by a close inspection of
this proof, that $X↓m=Y↓n$ if and only if $X=(qY+r)/(sY+t)$, where
$qt-rs=(-1)↑{m-n}$.]

\ansno 12. (a)\9 Since $V↓nV↓{n+1}=D-U↓{\!n}↑2$, we know that $D-U↓{\!n+1}↑2$ is
a multiple of $V↓{n+1}$; hence by induction $X↓n=(\sqrt D-U↓n)/V↓n$, where $U↓n$
and $V↓n$ are integers.\xskip
 [Note that the identity $V↓{n+1}=A↓n(U↓{n-1}-U↓n)+V↓{n-1}$
makes it unnecessary to divide when $V↓{n+1}$ is being determined.]

(b)\9 Let $Y=(-\sqrt D-U)/V$, $Y↓n=(-\sqrt D-U↓n)/V↓n$. The stated identity
obviously holds by replacing $\sqrt D$ by $-\sqrt D$ in the proof of (a). We have
$$Y=(p↓n/Y↓n+p↓{n-1})/(q↓n/Y↓n+q↓{n-1}),$$
where $p↓n$ and $q↓n$ are defined in part (c) of this exercise; hence
$$Y↓n=(-q↓n/q↓{n-1})(Y-p↓n/q↓n)/(Y-p↓{n-1}/q↓{n-1}).$$
But by (12), $p↓{n-1}/q↓{n-1}$ and $p↓n/q↓n$
are extremely close to $X$; since $X ≠ Y$, $Y - p↓n/q↓n$ and $Y
- p↓{n-1}/q↓{n-1}$ will have the same sign as $Y - X$ for all
large $n$. This proves that $Y↓n < 0$ for all large $n$; hence
$0 < X↓n < X↓n - Y↓n = 2\sqrt{D}/V↓n$; $V↓n$ must be positive.
Also $U↓n < \sqrt{D}$, since $X↓n > 0$. Hence $V↓n < 2\sqrt D$, since
$V↓n≤A↓nV↓n<\sqrt D+U↓{n-1}$.

Finally, we want to show that $U↓n>0$. Since $X↓n<1$, we have $U↓n>\sqrt D-V↓n$,
so we need only consider the case $V↓n>\sqrt D$; then $U↓n={A↓nV↓n-U↓{n-1}}≥
{V↓n-U↓{n-1}}>{\sqrt D-U↓{n-1}}$, and this is positive as we have already
observed.

{\sl Note:} In the repeating cycle we have $\sqrt D+U↓n=A↓nV↓n+(\sqrt D-U↓{n-1})>
V↓n$; hence $\lfloor(\sqrt D+U↓{n+1})/V↓{n+1}\rfloor=\lfloor A↓{n+1}+V↓n/(\sqrt D
+U↓n)\rfloor=A↓{n+1}=\lfloor(\sqrt D+U↓n)/V↓{n+1}\rfloor$. Thus, $A↓{n+1}$ is
determined by $U↓{n+1}$ and $V↓{n+1}$; we can determine $(U↓n,V↓n)$ from
$(U↓{n+1},V↓{n+1})$ in the period. In fact, when $0<V↓n<\sqrt D+U↓n$ and
$0<U↓n<\sqrt D$, the arguments above prove that $0<V↓{n+1}<\sqrt D+U↓{n+1}$ and
$0<U↓{n+1}<\sqrt D\,$; moreover, if the pair $(U↓{n+1},V↓{n+1})$ follows $(U↑\prime,
V↑\prime)$ with $0<V↑\prime<\sqrt D+U↑\prime$ and $0<U↑\prime<\sqrt D$, then
$U↑\prime=U↓n$ and $V↑\prime=V↓n$. Hence {\sl$(U↓n,V↓n)$ is part of the cycle
if and only if\/\ $0<V↓n<\sqrt D+U↓n$ and\/\ $0<U↓n<\sqrt D$.}

\yyskip (c) \hskip40pt$\dispstyle{-V↓{n+1}\over V↓n} = X↓nY↓n
 = {(q↓nX - p↓n)(q↓nY - p↓n)\over
(q↓{n-1}X - p↓{n-1})(q↓{n-1}Y - p↓{n-1})}.$

\yyskip\noindent There is also a companion identity, namely
$$Vp↓np↓{n-1} + U(p↓nq↓{n-1} + p↓{n-1}q↓n) + \biglp (U↑2 -
D)/V\bigrp q↓nq↓{n-1} = (-1)↑nU↓n.$$

(d)\9 If $X↓n = X↓m$ for some $n ≠ m$, then $X$
is an irrational number that satisfies the quadratic equation
$(q↓nX - p↓n)/(q↓{n-1}X - p↓{n-1}) = (q↓mX - p↓m)/(q↓{m-1}X -
p↓{m-1})$.
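The integer recurrences of part (a) give a practical way to expand $\sqrt D$; here is a Python sketch (function name ours; repetition of a pair $(U↓n,V↓n)$ marks the period, as in the Note above):

```python
from math import isqrt

def sqrt_cf(D):
    # Continued fraction of sqrt(D): returns (A0, period), using
    # X_n = (sqrt(D) - U_n)/V_n and V_n V_{n+1} = D - U_n^2.
    a0 = isqrt(D)
    if a0 * a0 == D:
        return a0, []                    # perfect square: expansion terminates
    U, V = a0, 1                         # sqrt(D) = a0 + (sqrt(D) - U_0)/V_0
    seen, period = set(), []
    while (U, V) not in seen:
        seen.add((U, V))
        V1 = (D - U * U) // V            # exact division, as noted above
        A = (a0 + U) // V1               # = floor((sqrt(D) + U)/V1)
        U, V = A * V1 - U, V1
        period.append(A)
    return a0, period
```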

\ansno 14. As in exercise 9, we need only verify the stated
identities when $c$ is the last partial quotient, and this verification
is trivial. Now Hurwitz's rule gives $2/e = \bslash 1, 2, 1,
2, 0, 1, 1, 1, 1, 1, 0, 2, 3, 2, 0, 1, 1, 3, 1, 1, 0, 2, 5,
\ldotss\bslash $. Taking the reciprocal, collapsing out the
zeros as in exercise 9, and taking note of the pattern that
appears, we find (cf.\ exercise 16) that $e/2 = 1 + \bslash\,
2, \overline{2m + 1, 3, 1, 2m + 1, 1, 3}\bslash$, $m ≥ 0$.\xskip [{\sl Schriften der
phys.-\"okon.\ Gesellschaft zu K\"onigsberg \bf 32}
(1891), 59--62.]

\ansno 15. $\biglp$This procedure maintains
four integers $(A, B, C, D)$ with the invariant meaning that
``our remaining job is to output the continued fraction for
$(Ay + B)/(Cy + D)$, where $y$ is the input yet to come.''$\bigrp$\xskip
Initially set $j ← k ← 0$, $(A, B, C, D) ← (a, b, c, d)$; then
input $x↓j$ and set $(A, B, C, D) ← (Ax↓j + B, A, Cx↓j + D,
C)$, $j ← j + 1$, one or more times until $C + D$ has the same
sign as $C$.\xskip $\biglp$When $j ≥ 1$ and the input has not terminated,
we know that $1 < y < ∞$; and when $C + D$ has the same sign
as $C$ we know therefore that $(Ay + B)/(Cy + D)$ lies between
$(A + B)/(C + D)$ and $A/C$.$\bigrp$\xskip Now comes the general step:
If no integer lies strictly between $(A + B)/(C + D)$ and $A/C$,
output $X↓k ← \min\biglp \lfloor A/C\rfloor ,\lfloor (A + B)/(C +
D)\rfloor \bigrp$\xskip$\biglp$the two floors can differ only when the larger
of the two bounds is an exact integer, which the true value never
attains$\bigrp$, and set $(A, B, C, D) ←
(C, D, A - X↓kC, B - X↓kD)$, $k ← k + 1$; otherwise input $x↓j$
and set $(A, B, C, D) ← (Ax↓j + B, A, Cx↓j + D, C)$, $j ← j +
1$. The general step is repeated ad infinitum. However, if at
any time the {\sl final\/} $x↓j$ is input, the algorithm immediately
switches gears: It outputs the continued fraction for $(Ax↓j
+ B)/(Cx↓j + D)$, using Euclid's algorithm, and terminates.

The following tableau shows the working for the
requested example, where the matrix $\left({B\atop D}\,{A\atop C}\right)$
begins at the upper left corner, then shifts right one on
input, down one on output:
$$\vbox{\baselineskip0pt\lineskip0pt
\def\\{\vrule depth 2.5pt height 8.5pt}
\def\¬{\vrule height 3pt}
\halign{\hfill#⊗\hbox to 24.2pt{$\hfill#$\9}⊗#⊗\!
\hbox to 30pt{$\hfill#$\9}⊗\!
\hbox to 30pt{$\hfill#$\9}⊗\!
\hbox to 30pt{$\hfill#$\9}⊗\!
\hbox to 30pt{$\hfill#$\9}⊗\!
\hbox to 30pt{$\hfill#$\9}⊗\!
\hbox to 30pt{$\hfill#$\9}⊗\!
\hbox to 30pt{$\hfill#$\9}⊗\!
\hbox to 30pt{$\hfill#$\9}⊗\!
\hbox to 30pt{$\hfill#$\9}⊗\!
\hbox to 34.6pt{$\hfill#$\quad}⊗#\cr
\noalign{\moveright 25pt\hbox to 305pt{\leaders\hrule\hfill}}
⊗⊗\¬⊗⊗⊗⊗⊗⊗⊗⊗⊗⊗⊗\¬\cr
⊗⊗\\⊗x↓j⊗-1⊗5⊗1⊗1⊗1⊗2⊗1⊗2⊗∞⊗\\\cr
⊗⊗\¬⊗⊗⊗⊗⊗⊗⊗⊗⊗⊗⊗\¬\cr
\noalign{\hrule}
\¬⊗⊗\¬⊗⊗⊗⊗⊗⊗⊗⊗⊗⊗⊗\¬\cr
\\⊗X↓k⊗\\⊗39⊗97⊗-58⊗-193⊗⊗⊗⊗⊗⊗⊗\\\cr
\\⊗-2⊗\\⊗-25⊗-62⊗37⊗123⊗⊗⊗⊗⊗⊗⊗\\\cr
\\⊗2⊗\\⊗⊗⊗16⊗53⊗⊗⊗⊗⊗⊗⊗\\\cr
\\⊗3⊗\\⊗⊗⊗5⊗17⊗22⊗39⊗⊗⊗⊗⊗\\\cr
\\⊗7⊗\\⊗⊗⊗1⊗2⊗3⊗5⊗8⊗⊗⊗⊗\\\cr
\\⊗1⊗\\⊗⊗⊗⊗⊗1⊗4⊗5⊗14⊗⊗⊗\\\cr
\\⊗1⊗\\⊗⊗⊗⊗⊗⊗1⊗3⊗7⊗⊗⊗\\\cr
\\⊗1⊗\\⊗⊗⊗⊗⊗⊗⊗2⊗7⊗9⊗25⊗\\\cr
\\⊗12⊗\\⊗⊗⊗⊗⊗⊗⊗1⊗0⊗1⊗2⊗\\\cr
\\⊗2⊗\\⊗⊗⊗⊗⊗⊗⊗⊗⊗⊗1⊗\\\cr
\\⊗∞⊗\\⊗⊗⊗⊗⊗⊗⊗⊗⊗⊗0⊗\\\cr
\¬⊗⊗\¬⊗⊗⊗⊗⊗⊗⊗⊗⊗⊗⊗\¬\cr}
\hrule}$$
M. Mend\`es France has shown
that the number of quotients output per quotient input is asymptotically
bounded between $1/r$ and $r$, where $r = 2\lfloor K(|ad - bc|)/2\rfloor
+ 1$ and $K$ is the function defined in exercise 38; this bound
is best possible.\xskip [{\sl Topics in Number Theory}, ed.\ by P. Tur\'an,
{\sl Colloquia Math. Soc. J\'anos Bolyai \bf13} (1976), 183--194.]

The above algorithm can be generalized to compute
the continued fraction for $(axy + bx + cy + d)/(Axy + Bx +
Cy + D)$ from those of $x$ and $y$ (in particular, to compute
sums and products); see R. W. Gosper, to appear.
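The whole procedure can be transcribed into Python with exact rationals (a sketch; we output the smaller of the two floors, which matters only when the larger bound is an exact integer, since the value never attains its bounds):

```python
from fractions import Fraction
from math import floor, ceil

def euclid_cf(p, q):
    # Partial quotients of p/q, floor convention.
    out = []
    while q != 0:
        a = p // q
        out.append(a)
        p, q = q, p - a * q
    return out

def cf_transform(xs, a, b, c, d):
    # Stream the partial quotients of (a*x + b)/(c*x + d), where the finite
    # list xs holds the partial quotients of x.
    A, B, C, D = a, b, c, d
    out = []
    for j, x in enumerate(xs):
        A, B, C, D = A * x + B, A, C * x + D, C    # absorb one input quotient
        if j == len(xs) - 1:
            break                                  # final input: value is A/C
        while C != 0 and C + D != 0 and (C > 0) == (C + D > 0):
            lo, hi = sorted([Fraction(A, C), Fraction(A + B, C + D)])
            if ceil(hi) - 1 > lo:                  # an integer lies strictly between
                break                              # so another input is needed
            q = floor(lo)                          # floor of the value for any y > 1
            out.append(q)
            A, B, C, D = C, D, A - q * C, B - q * D
    return out + euclid_cf(A, C)
```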
%folio 781 galley 8 (C) Addison-Wesley 1978	*
\def\bslash{\char'477 } \def\vbslash{\char'477017 } % boldface slashes (vol. 2 only)
\ansno 16. It is not difficult to prove by induction that $f↓n(z)
= z/(2n + 1) + O(z↑3)$ is an odd function with a convergent
power series in a neighborhood of the origin, and that it satisfies
the given differential equation. Hence
$$f↓0(z) = \bslash z↑{-1} + f↓1(z)\bslash = \cdots = \bslash
z↑{-1}, 3z↑{-1}, \ldotss , (2n + 1)z↑{-1} + f↓{n+1}(z)\bslash.$$
It remains to prove that $\lim↓{n→∞}\bslash z↑{-1},
3z↑{-1}, \ldotss , (2n + 1)z↑{-1}\bslash = f↓0(z)$.\xskip [Actually
Euler, age 24, obtained continued fraction expansions for the
considerably more general differential equation $f↑\prime↓{\!n}(z)
= az↑m + bf↓n(z)z↑{m-1} + cf↓n(z)↑2$; but he did not bother
to prove convergence, since formal manipulation and intuition
were good enough in the eighteenth century.]
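The convergence is easy to observe numerically; a small Python check (function name ours) evaluates the continued fraction from the inside out:

```python
import math

def tanh_via_cf(z, n):
    # Evaluate 1/(1/z + 1/(3/z + ... + 1/((2n-1)/z))), the n-term
    # truncation of the continued fraction for tanh z.
    v = 0.0
    for k in range(n, 0, -1):
        v = 1.0 / ((2 * k - 1) / z + v)
    return v
```

The first two truncations are $z$ and $3z/(3+z↑2)$, and already a dozen terms match $\hbox{tanh}\,z$ to machine precision for moderate $z$.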

There are several ways to prove the desired limiting
equation. First, letting $f↓n(z) = \sum ↓k a↓{nk}z↑k$, we can
argue from the equation
$$\twoline{(2n + 1)a↓{n1} + (2n + 3)a↓{n3}z↑2 + (2n + 5)a↓{n5}z↑4
+\cdots}{0pt}{= 1 - (a↓{n1}z + a↓{n3}z↑3 + a↓{n5}z↑5 +\cdotss)↑2}$$
that $(-1)↑ka↓{n(2k+1)}$ is a sum of terms of
the form $c↓k/(2n + 1)↑{k+1}(2n + b↓{k1}) \ldotsm (2n + b↓{kk})$,
where the $c↓k$ and $b↓{km}$ are positive integers independent
of $n$. For example, $-a↓{n7} = 4/(2n + 1)↑4(2n + 3)(2n + 5)(2n
+ 7) + 1/(2n + 1)↑4(2n + 3)↑2(2n + 7)$. Thus $|a↓{(n+1)k}| ≤
|a↓{nk}|$, and $|f↓n(z)| ≤ \tan |z|$ for $|z| < π/2$. This
uniform bound on $f↓n(z)$ makes the convergence proof very simple.
Careful study of this argument reveals that the power series
for $f↓n(z)$ actually converges for $|z| < π\sqrt{2n + 1}/2$;
this is interesting, since it shows that the singularities of
$f↓n(z)$ get farther and farther away from the origin as $n$
grows, so the continued fraction actually represents $\hbox{tanh}\,z$
{\sl throughout} the complex plane.

Another proof gives further information of a different
kind: If we let
$$A↓n(z) = n! \sum ↓{0≤k≤n}{2n-k\choose n}\,z↑k/k!
= \sum ↓{k≥0} {(n + k)!\,z↑{n-k}\over k!\,(n - k)!} ,$$
then
$$\eqalign{A↓{n+1}(z)⊗= \sum ↓{k≥0} {(n + k - 1)!\,\biglp (4n + 2)k +
(n + 1 - k)(n - k)\bigrp\over k!\,(n + 1 - k)!}\,z↑
{n+1-k}\cr⊗= (4n + 2)A↓n(z)+ z↑2A↓{n-1}(z).\cr}$$
It follows, by induction, that
$$\baselineskip29pt
\eqalign{Q↓n\left({1\over z}, {3\over z} , \ldotss , {2n - 1\over z}\right)
⊗ = {A↓n(2z) + A↓n(-2z)\over 2↑{n+1}z↑n},\cr
Q↓{n-1}\left({3\over z} , \ldotss , {2n - 1\over z}\right) ⊗= {A↓n(2z)
- A↓n(-2z)\over 2↑{n+1}z↑n} .\cr}$$
Hence
$$\bslash z↑{-1}, 3z↑{-1}, \ldotss , (2n - 1)z↑{-1}\bslash
= {A↓n(2z) - A↓n(-2z)\over A↓n(2z) + A↓n(-2z)} ,$$
and we want to show that this ratio approaches
$\hbox{tanh}\,z$. By Eqs.\ 1.2.9--11 and 1.2.6--24,
$$e↑zA↓n(-z) = n! \sum ↓{m≥0} z↑m\,\biggglp \sum
↓{0≤k≤n} {m\choose k}{2n - k\choose n}(-1)↑k\,\bigggrp
=\sum ↓{m≥0}{2n - m\choose n}\,z↑m\,{n!\over m!}.$$
Hence
$$e↑zA↓n(-z) - A↓n(z) = R↓n(z) = (-1)↑nz↑{2n+1} \sum ↓{k≥0}
{(n + k)!\,z↑k\over (2n + k + 1)!\,k!} .$$
We now have $(e↑{2z} - 1)\biglp A↓n(2z)
+ A↓n(-2z)\bigrp - (e↑{2z} + 1)\biglp A↓n(2z) - A↓n(-2z)\bigrp
= 2R↓n(2z)$; hence
$$\hbox{tanh}\,z-\bslash z↑{-1}, 3z↑{-1}, \ldotss , (2n - 1)z↑{-1}\bslash
= {2R↓n(2z)\over \biglp A↓n(2z) + A↓n(-2z)\bigrp (e↑{2z} + 1)}.$$
Thus we have an exact formula for the
difference. When $|z| ≤ 1$, the factor $e↑{2z} + 1$ is bounded away
from zero, $|R↓n(2z)| ≤ en!/(2n + 1)!$, and
$$\eqalign{\textstyle{1\over2}|A↓n(2z)+A↓n(-2z)|
⊗≥n!\,\biggglp{2n\choose n}-{2n-2\choose n}
-{2n-4\choose n}-\cdots\bigggrp\cr
\noalign{\vskip3pt}
⊗≥ {(2n)!\over n!}{\textstyle(1 - {1\over 4} - {1\over
16} -\cdotss)}= {2\over 3} {(2n)!\over n!}.\cr}$$
Thus convergence is very rapid, even for complex values of $z$.
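The rapid convergence is easy to observe numerically. The following Python fragment (a modern illustration, not part of the original answer) evaluates the continued fraction $\bslash z↑{-1}, 3z↑{-1}, \ldotss , (2n-1)z↑{-1}\bslash$ from the bottom up and compares it with $\hbox{tanh}\,z$:

```python
from math import tanh

def tanh_cf(z, n):
    # evaluate 1/(1/z + 1/(3/z + ... + 1/((2n-1)/z))) from the bottom up
    val = (2 * n - 1) / z
    for k in range(n - 1, 0, -1):
        val = (2 * k - 1) / z + 1 / val
    return 1 / val

# with only 8 quotients the error is already below double-precision noise
assert abs(tanh_cf(0.7, 8) - tanh(0.7)) < 1e-12
```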

To go from this continued fraction to the continued
fraction for $e↑z$, we have $\hbox{tanh}\,z = 1 - 2/(e↑{2z} + 1)$; hence
we get the continued-fraction representation for $(e↑{2z} +
1)/2$ by simple manipulations. Hurwitz's rule gives the expansion
of $e↑{2z} + 1$, from which we may subtract unity. For $n$ odd,
$$e↑{-2/n} = \bslash\,\overline{1, 3mn + \lfloor n/2\rfloor
, (12m + 6)n, (3m + 2)n + \lfloor n/2\rfloor , 1}\bslash ,\qquad
m ≥ 0.$$
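This expansion can be checked with exact rational arithmetic; the Python sketch below (an added illustration, with the number of Taylor-series terms chosen ad hoc) reproduces the first three Hurwitz blocks for $n = 3$:

```python
from fractions import Fraction

def exp_frac(x, terms=60):
    # exact partial sum of the Taylor series for e**x
    s = t = Fraction(1)
    for k in range(1, terms):
        t *= Fraction(x) / k
        s += t
    return s

def quotients(x, count):
    # first `count` quotients a1, a2, ... with x = 1/(a1 + 1/(a2 + ...)),
    # assuming 0 < x < 1
    q = []
    for _ in range(count):
        x = 1 / x
        a = x.numerator // x.denominator
        q.append(a)
        x -= a
    return q

n = 3
x = exp_frac(Fraction(-2, n))        # e^{-2/3} to ample precision
pattern = []
for m in range(3):                   # Hurwitz blocks for m = 0, 1, 2
    pattern += [1, 3*m*n + n//2, (12*m + 6)*n, (3*m + 2)*n + n//2, 1]
assert quotients(x, 15) == pattern
```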

Another derivation has been given by C. S. Davis,
{\sl J. London Math.\ Soc.\ \bf 20} (1945), 194--198.

\ansno 17. (b)\9 $\bslash x↓1 - 1, 1, x↓2
- 2, 1, x↓3 - 2, 1, \ldotss , 1, x↓{2n-1} - 2, 1, x↓{2n} - 1\bslash
$.

(c)\9
$1 + \bslash 1, 1, 3, 1, 5, 1,\ldotss\bslash=1+\bslash\overline{2m+1,1}\bslash$,
\quad $m≥0$.

\ansno 19. The sum for $1 ≤ k ≤ N$ is $\log↓b \biglp (1+x)(N+1)/(N+1+x)\bigrp$.

\ansno 20. Let $H = SG$, $g(x) = (1 + x)G↑\prime
(x)$, $h(x) = (1 + x)H↑\prime (x)$. Then (35) implies that $h(x
+ 1)/(x + 2) - h(x)/(x + 1) = -(1 + x)↑{-2}g\biglp1/(1 +x)\bigrp
/\biglp 1 + 1/(1 + x)\bigrp $.

\ansno 21. $\varphi (x) = c/(cx + 1)↑2 + (2 - c)/\biglp (c -
1)x + 1\bigrp ↑2$, $U\varphi (x) = 1/(x + c)↑2$. When $c ≤ 1$,
the minimum of $\varphi (x)/U\varphi (x)$ occurs at $x = 0$
and is $2c↑2 ≤ 2$. When $c ≥ \phi = {1\over 2}(\sqrt{5} + 1)$,
the minimum occurs at $x = 1$ and is $≤\phi ↑2$. When $c \approx
1.31266$ the values at $x = 0$ and $x = 1$ are nearly equal
and the minimum is $>3.2$; the bounds $(0.29)↑n\varphi ≤ U↑n\varphi
≤ (0.31)↑n\varphi$ are obtained. Still better bounds come from well-chosen
linear combinations $Tg(x) = \sum a↓j/(x + c↓j)$.

\ansno 23. By the interpolation formula of exercise 4.64--15
with $x↓0 = 0$, $x↓1 = x$, $x↓2 = x + ε$, letting $ε → 0$, we have
the general identity $R↑\prime↓{n}(x) = \biglp R↓n(x) - R↓n(0)\bigrp
/x + {1\over 2}xR↓n\biglp \theta (x)\bigrp$ for some $\theta
(x)$ between 0 and $x$, whenever $R↓n$ is a function with continuous
second derivative. Hence in this case $R↑\prime↓{n}(x) =
O(2↑{-n})$.

\ansno 24. $∞$.\xskip [A. Khinchin, in {\sl Compos.\ Math.\ \bf 1} (1935),
361--382, proved that the sum $A↓1 +\cdots + A↓n$
of the first $n$ partial quotients of a real number $X$ will
be asymptotically $n\lg n$, for almost all $X$.]
%folio 784 galley 9 (C) Addison-Wesley 1978	*
\def\bslash{\char'477 } \def\vbslash{\char'477017 } % boldface slashes (vol. 2 only)
\ansno 25. Any union of intervals can be written as a union
of disjoint intervals, since $\union↓{k≥1} I↓k = \union↓{k≥1}\biglp I↓k \rslash
\union↓{1≤j<k}
I↓j\bigrp$, and this is a disjoint union in which $I↓k\rslash\union↓{1≤j<k}
I↓j$ can be expressed as a finite union of disjoint intervals.
Therefore we may take $\Iscr = \union I↓k$, where $I↓k$ is an interval
of length $ε/2↑k$ containing the $k$th rational number in $[0,
1]$, using some enumeration of the rationals. In this case $\mu(
\Iscr)≤ ε$, but $\|\Iscr∩ P↓n\| = n$ for all $n$.

\ansno 26. The continued fractions $\bslash A↓1, \ldotss , A↓t\bslash$
that appear are precisely those for which ${A↓1 > 1}$, ${A↓t > 1}$,
and $Q↓t(A↓1, A↓2, \ldotss , A↓t)$ is a divisor of $n$. Therefore
(6) completes the proof.\xskip $\biglp${\sl Note:} If $m↓1/n
= \bslash A↓1, \ldotss , A↓t\bslash$ and $m↓2/n = \bslash
A↓t, \ldotss , A↓1\bslash $, where $m↓1$ and $m↓2$ are relatively
prime to $n$, then $m↓1m↓2 ≡ \pm 1\modulo n$; this rule
defines the correspondence. When $A↓1 = 1$ an analogous symmetry
is valid, according to (38).$\bigrp$

\ansno 27. First prove the result for
$n = p↑e$, then for $n = rs$, where $r$ and $s$ are relatively
prime. Alternatively, use the formulas in the next exercise.

\ansno 28. (a)\9 The left-hand side is multiplicative
(see exercise 1.2.4--31), and it is easily evaluated when $n$
is a power of a prime.\xskip (c) From (a), we have {\sl M\"obius's
inversion formula\/}: If $f(n) = \sum ↓{d\rslash n} g(d)$,
then $g(n) = \sum ↓{d\rslash n} \mu (n/d)f(d)$.

\ansno 29. The sum is approximately $\biglp\biglp(12\ln 2)/π↑2\bigrp\ln N!\bigrp/N
- \sum ↓{d≥1}\Lambdait(d)/d↑2 + 1.47$; here $\sum ↓{d≥1}\Lambdait
(d)/d↑2$ converges to the constant value $-\zeta ↑\prime
(2)/\zeta (2)$, while $\ln N! = N \ln N - N + O(\log N)$ by
Stirling's approximation.

\ansno 30. The modified algorithm affects the calculation if
and only if the following division step in the unmodified algorithm
would have the quotient 1, and in this case it avoids the following
division step. The probability that a given division step is
avoided is the probability that $A↓k = 1$ and that this quotient
is preceded by an even number of quotients equal to 1. By the
symmetry condition, this is the probability that $A↓k = 1$ and
is {\sl followed} by an even number of quotients equal to 1.
The latter happens if and only if $X↓{k-1} > \phi - 1 = 0.618
\ldotss $, where $\phi$ is the golden ratio: For $A↓k = 1$ and $A↓{k+1}
>1$ iff ${2\over 3} ≤ X↓{k-1} < 1$; $A↓k = A↓{k+1} = A↓{k+2}
= 1$ and $A↓{k+3} > 1$ iff ${5\over 8} ≤ X↓{k-1} < {2\over 3}$;
etc. Thus we save approximately $F↓{k-1}(1) - F↓{k-1}(\phi -
1) \approx 1 - \lg \phi \approx 0.306$ of the division steps.
The average number of steps is approximately $\biglp(12\ln \phi)/π↑2\bigrp\ln
n$, when $v = n$ and $u$ is relatively prime to $n$. Kronecker
[{\sl Vorlesungen \"uber Zahlentheorie \bf 1} (Leipzig:
Teubner, 1901), 118] observed that this choice of least remainder
in absolute value always gives the shortest possible number
of iterations, over all algorithms that replace $u$ by $(\pm
u)\mod v$ at each iteration. For further results see N. G.
de Bruijn and W. M. Zaring, {\sl Nieuw Archief voor Wiskunde
(3) \bf 1} (1953), 105--112; G. J. Rieger, {\sl Notices Amer.\ Math.\ Soc.\
\bf 22} (1975), A-616; {\bf23} (1976), A-474.

On many computers, the modified algorithm makes
each division step longer; the idea of exercise 1, which saves
{\sl all} division steps when the quotient is unity, would be
preferable in such cases.
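The predicted saving of about 30.6 percent of the division steps is easy to confirm empirically; the following Python sketch (an added illustration, with arbitrarily chosen sample size and range) compares ordinary division steps with least-absolute-remainder steps on random pairs:

```python
from random import randrange, seed

def euclid_steps(u, v):
    n = 0
    while v:
        u, v = v, u % v
        n += 1
    return n

def least_remainder_steps(u, v):
    # Euclid's algorithm taking the remainder of least absolute value
    n = 0
    while v:
        r = u % v
        if 2 * r > v:
            r = v - r
        u, v = v, r
        n += 1
    return n

seed(1)
pairs = [(randrange(1, 10**6), randrange(1, 10**6)) for _ in range(2000)]
std = sum(euclid_steps(u, v) for u, v in pairs)
lr  = sum(least_remainder_steps(u, v) for u, v in pairs)
saved = 1 - lr / std
assert 0.25 < saved < 0.36   # theory predicts about 0.306
```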

\ansno 31. Let $a↓0 = 0$, $a↓1 = 1$, $a↓{n+1} = 2a↓n + a↓{n-1}$;
then $a↓n = \biglp (1 + \sqrt{2})↑n - (1 - \sqrt{2})↑n\bigrp
/2\sqrt{2}$, and the worst case (in the sense of Theorem F)
occurs when $u = a↓n + a↓{n-1}$, $v = a↓n$, $n ≥ 2$.

This result is due to A. Dupr\'e [{\sl
J. de Math.\ \bf 11} (1846), 41--64], who also investigated more
general ``look-ahead'' procedures suggested by J. Binet. See
P. Bachmann, {\sl Niedere Zahlentheorie \bf 1} (Leipzig: Teubner,
1902), 99--118, for a discussion of early analyses of Euclid's
algorithm.
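A small Python check (an added illustration) confirms both the worst-case pair and its minimality for small $n$; since the behavior of the least-remainder algorithm depends only on $u\mod v$, scanning $v < u ≤ 2v$ is exhaustive:

```python
def lr_steps(u, v):
    # division steps of Euclid's algorithm with least absolute remainder
    n = 0
    while v:
        r = u % v
        r = min(r, v - r)
        u, v = v, r
        n += 1
    return n

# a_0 = 0, a_1 = 1, a_{n+1} = 2 a_n + a_{n-1}  (Pell numbers)
a = [0, 1]
for _ in range(10):
    a.append(2 * a[-1] + a[-2])

# u = a_n + a_{n-1}, v = a_n requires exactly n steps
for n in range(2, 10):
    assert lr_steps(a[n] + a[n - 1], a[n]) == n

# and no smaller v requires that many steps
worst = {}
for v in range(1, 200):
    for u in range(v + 1, 2 * v + 1):
        worst.setdefault(lr_steps(u, v), v)   # first (smallest) v per count
for n in range(2, 7):
    assert worst[n] == a[n]
```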

\ansno 32. (b)\9 $Q↓{m-1}(x↓1, \ldotss , x↓{m-1})Q↓{n-1}(x↓{m+2}, \ldotss
, x↓{m+n})$ corresponds to Morse code sequence of length $(m
+ n)$ in which a dash occupies positions $m$ and $(m + 1)$;
the other term corresponds to the opposite case.\xskip (Alternatively,
use exercise 2. The more general identity
$$\twoline{Q↓{m+n}(x↓1, \ldotss , x↓{m+n})Q↓k(x↓{m+1}, \ldotss , x↓{m+k})
= Q↓{m+k}(x↓1, \ldotss , x↓{m+k})Q↓n(x↓{m+1}, \ldotss , x↓{m+n})}{2pt}{\null
+ (-1)↑kQ↓{m-1}(x↓1, \ldotss , x↓{m-1})Q↓{n-k-1}(x↓{m+k+2}, \ldotss
, x↓{m+n})}$$ also appeared in Euler's paper.)

\ansno 33. (a)\9 The new representations are $x = m/d$, $y = (n
- m)/d$, $x↑\prime = y↑\prime = d = \gcd(m, n - m)$, for ${1\over
2}n < m < n$.\xskip (b) The relation $(n/x↑\prime ) - y ≤ x < n/x↑\prime$
defines $x$.\xskip (c) Count the $x↑\prime$ satisfying (b).\xskip (d) A
pair of integers $x > y > 0$, $\gcd(x, y)= 1$, can be uniquely
written in the form $x = Q↓m(x↓1, \ldotss , x↓m)$, $y = Q↓{m-1}(x↓1,
\ldotss , x↓{m-1})$, where ${x↓1 ≥ 2}$ and ${m ≥ 1}$; here $y/x = \bslash
x↓m, \ldotss , x↓1\bslash $.\xskip (e) It suffices to show that $\sum
↓{1≤k≤n/2}T(k, n) = 2\lfloor n/2\rfloor + h(n)$. For $1 ≤ k
≤ n/2$ the continued fractions $k/n = \bslash x↓1, \ldotss ,
x↓m\bslash$ run through all sequences $(x↓1,\ldotss,x↓m)$ such that
$m≥1$, $x↓1≥2$, $x↓m≥2$, $Q↓m(x↓1,\ldotss,x↓m)\rslash n$; and $T(k,n)=2 + (m - 1)$.

\ansno 34. (a)\9 Dividing $x$ and $y$ by $\gcd(x,
y)$ yields $g(n) = \sum ↓{d\rslash n} h(n/d)$; apply exercise
28(c), and use the symmetry between primed and unprimed variables.\xskip
(b) For fixed $y$ and $t$, the representations with $xd ≥ x↑\prime$
have $x↑\prime < \sqrt{nd}$; hence there are $O(\sqrt{nd}/y)$
such representations. Now sum for $0 < t ≤ y < \sqrt{n/d}$.\xskip
(c) If $s(y)$ is the given sum, then $\sum ↓{d\rslash y} s(d)
= y(H↓{2y} - H↓y) = k(y)$, say; hence $s(y) = \sum ↓{d\rslash
y} k(y/d)$. Now $k(y) = y \ln 2 - {1\over 4} + O(1/y)$.\xskip (d)
$\sum ↓{1≤y≤n} \varphi (y)/y↑2 = \sum ↓{1≤y≤n,\,d\rslash y} \mu
(d)/yd = \sum ↓{cd≤n} \mu (d)/cd↑2$.\xskip $\biglp$Similarly, $\sum ↓{1≤y≤n}
\sigma ↓{-1}(y)/y↑2 = O(1).\bigrp$\xskip (e) $\sum ↓{1≤k≤n} \mu (k)/k↑2
= 6/π↑2 + O(1/n)$ $\biglp$see exercise 4.5.2--10(d)$\bigrp$; and $\sum ↓{1≤k≤n}
\mu (k)\log k/k↑2 = O(1)$. Hence $h↓d(n) = n\biglp(3 \ln 2)/π↑2\bigrp\*
\ln(n/d) + O(n)$ for $d ≥ 1$. So $h(n) = 2 \sum ↓{cd\rslash
n} \mu (d)h↓c(n/cd) = \biglp (6 \ln 2)/π↑2\bigrp n\biglp \ln n -
\sum - \sum ↑\prime \bigrp+ O\biglp n\sigma ↓{-1}(n)↑2\bigrp $, where
the remaining sums are
$\sum = \sum ↓{cd\rslash n} \mu (d)\ln(cd)/cd = 0$ and $\sum
↑\prime = \sum ↓{cd\rslash n} \mu (d)\ln c/cd = \sum ↓{d\rslash
n}\Lambdait(d)/d$.\xskip [It is well known that $\sigma ↓{-1}(n) =
O(\log\log n)$; cf.\ Hardy and Wright, {\sl Theory of Numbers},
$\section$22.9.]

\ansno 35. See {\sl Proc.\ Nat.\ Acad.\ Sci.\ \bf72} (1975), 4720--4722.

\ansno 36. Working the algorithm backwards, we
want to choose $k↓1$, $\ldotss$, $k↓{n-1}$ so that $u↓i ≡ F↓{k↓1}
\ldotsm F↓{k↓{i-1}}F↓{k↓i}-1$ $\biglp$
modulo $\gcd(u↓{i+1}, \ldotss , u↓n)\bigrp$ for $1 ≤i< n$, with
$u↓n = F↓{k↓1} \ldotsm F↓{k↓{n-1}}$ a
minimum, where the $k$'s are positive, $k↓1 ≥ 3$, and $k↓1 +\cdots
+ k↓{n-1} = N + n - 1$. The solution is $k↓2 =\cdots
= k↓{n-1} = 2$, $u↓n = F↓{N-n+3}$.\xskip [See {\sl CACM
\bf 13} (1970), 433--436, 447--448.]

\ansno 37. See {\sl Proc.\ Amer.\ Math.\
Soc.\ \bf 7} (1956), 1014--1021; cf.\ also exercise 6.1--18.

\ansno 38. Let $m = \lceil n/\phi \rceil $, so that $m/n = \phi
↑{-1} + ε = \bslash x↓1, x↓2, \ldotss \bslash$ where $0 < ε
< 1/n$. Let $k$ be minimal such that $x↓k ≥ 2$; then $\biglp
\phi ↑{1-k} + (-1)↑kF↓{k-1}ε\bigrp\hbox{\:a/}\biglp \phi ↑{-k} - (-1)↑kF↓kε\bigrp
≥ 2$, hence $k$ is even and $\phi ↑{-2} = 2 - \phi ≤ \phi ↑kF↓{k+2}ε
= (\phi ↑{2k+2} - \phi ↑{-2})ε/\sqrt{5}$.\xskip [{\sl Ann.\ Polon.\ Math.\
\bf 1} (1954), 203--206.]

\ansno 39. At least 287 at bats; $\bslash
\,2, 1, 95\bslash = 96/287 = .33449477 \ldotss$, and no fraction
with denominator $<287$ lies in the interval $[.3335, .3345] = [\bslash\, 2,
1, 666\bslash ,\, \bslash\, 2, 1, 94, 1, 1, 3\bslash ]$.
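Brute force over denominators (a modern check, not in the original answer) confirms that $96/287$ is the fraction of smallest denominator in the interval:

```python
from fractions import Fraction

lo, hi = Fraction(3335, 10000), Fraction(3345, 10000)

def smallest_denominator_in(lo, hi):
    # try q = 1, 2, 3, ...; fine at this scale
    q = 1
    while True:
        p = -((-lo.numerator * q) // lo.denominator)   # ceil(lo * q)
        if Fraction(p, q) <= hi:
            return Fraction(p, q)
        q += 1

best = smallest_denominator_in(lo, hi)
assert best == Fraction(96, 287)
```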

To solve the general question of the fraction
in $[a, b]$ with smallest denominator, where $0 < a < b < 1$,
note that in terms of regular continued-fraction representations
we have $\bslash x↓1, x↓2,\ldotss\bslash < \bslash y↓1, y↓2,
\ldotss\bslash$ iff $(-1)↑jx↓j<(-1)↑jy↓j$ for the smallest $j$ with $x↓j
≠ y↓j$, where we place ``$∞$'' after the last partial quotient
of a rational number. Thus if $a = \bslash x↓1, x↓2,\ldotss\bslash$
and $b = \bslash y↓1, y↓2,\ldotss\bslash $, and if $j$ is minimal with
$x↓j≠y↓j$, the fractions in
$[a, b]$ have the form $c = \bslash x↓1, \ldotss , x↓{j-1},
z↓j, \ldotss , z↓m\bslash$ where $\bslash z↓j, \ldotss , z↓m\bslash$
lies between $\bslash x↓j, x↓{j+1},\ldotss\bslash$ and $\bslash
y↓j, y↓{j+1},\ldotss\bslash$ inclusive. Let $Q↓{-1} = 0$. The
denominator $$Q↓{j-1}(x↓1, \ldotss , x↓{j-1})Q↓{m-j+1}(z↓j, \ldotss
, z↓m) + Q↓{j-2}(x↓1, \ldotss , x↓{j-2})Q↓{m-j}(z↓{j+1}, \ldotss
, z↓m)$$ of $c$ is minimized when $m = j$ and $z↓j = (j$ odd
$\→ y↓j + 1 - \delta ↓{y↓{j+1}∞};\,x↓j+1-\delta↓{x↓{j+1}∞})$.\xskip
[Another way to derive this method comes from the theory in the
following exercise.]

\ansno 40. One can prove by induction that $p↓rq↓l-p↓lq↓r=1$ at each node,
hence $p↓l$ and $q↓l$ are relatively prime. Since $p/q<p↑\prime/q↑\prime$
implies that $p/q<(p+p↑\prime)/(q+q↑\prime)<p↑\prime/q↑\prime$, it is
also clear that the labels on all left descendants of $p/q$ are less than
$p/q$, while the labels on all its right descendants are greater. Therefore
each rational number occurs at most once as a label.

It remains to show that each rational does appear. If $p/q=\bslash a↓1,
\ldotss,a↓r,1\bslash$, where each $a↓i$ is a positive integer, one can show
by induction that the node labeled $p/q$ is found by going left $a↓1$ times, then
right $a↓2$ times, then left $a↓3$ times, etc.
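The navigation rule can be verified mechanically. In the Python sketch below (an added illustration; the helper names are ad hoc), the tree is represented by its bounding fractions $0/1$ and $1/0$, each left or right move replaces a bound by the current mediant, and every $p/q ≤ 1$ in lowest terms is reached by the stated left/right runs:

```python
from math import gcd

def path_to(p, q):
    # directions (left-run, right-run, ...) for p/q <= 1, gcd(p,q) = 1,
    # via its continued fraction written so as to end with quotient 1
    a = []
    pp, qq = p, q
    while pp:
        a.append(qq // pp)
        pp, qq = qq % pp, pp
    if a[-1] > 1:                 # force the expansion to end with 1
        a[-1] -= 1
        a.append(1)
    return a[:-1]                 # path is a1, ..., ar

def peirce_label(path):
    # follow left/right runs in the tree of mediants, starting at 1/1
    pl, ql, pr, qr = 0, 1, 1, 0   # outer bounds 0/1 and 1/0
    p, q = 1, 1
    for i, run in enumerate(path):
        for _ in range(run):
            if i % 2 == 0:        # left moves
                pr, qr = p, q
            else:                 # right moves
                pl, ql = p, q
            p, q = pl + pr, ql + qr
    return p, q

for q in range(1, 30):
    for p in range(1, q + 1):
        if gcd(p, q) == 1:
            assert peirce_label(path_to(p, q)) == (p, q)
```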

[Peirce communicated this construction in a letter dated July 17, 1903, but
he never published it; and during the next few years he occasionally
amused himself by making rather cryptic remarks about it without revealing the
underlying mechanism. See C. S. Peirce, {\sl The New Elements of Mathematics
\bf3} (The Hague: Mouton, 1976), 781--784, 826--829; also {\bf1}, 207--211;
and his {\sl Collected Papers \bf4} (1933), 276--280.]

\ansno 41. (We assume that $x>0$.)\xskip Apply Euclid's algorithm to get a
continued fraction $x=\bslash a↓1,a↓2,\ldotss\bslash$; here $a↓1$ may equal 0,
but $a↓2$, $a↓3$, $\ldots$ must be $≥1$. Let $p↓0=1$, $q↓0=0$, $p↓1=0$,
$q↓1=1$, $p↓{j+1}=a↓jp↓j+p↓{j-1}$, $q↓{j+1}=a↓jq↓j+q↓{j-1}$, and stop
at the largest $j$ such that $p↓j/q↓j$ is representable. If $p↓j/q↓j≠x$, find
the largest $b≥0$ such that $(bp↓j+p↓{j-1})/(bq↓j+q↓{j-1})=p/q$ is
representable. By the theory of the Peirce tree it follows that $p↓j/q↓j$ is
the nearest representable number on one side of $x$ and $p/q$ is the nearest
representable number on the other side; all other representable numbers are
further away because of the nature of symmetric order. Therefore
round$(x)$ is $p↓j/q↓j$ or $p/q$, whichever is closer.\xskip (It frequently happens
that $b=0$; in this case we have $p/q=p↓{j-1}/q↓{j-1}$, so $p↓j/q↓j$ will
automatically be closer.)
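The following Python sketch (an illustration; it assumes, as one concrete reading of ``representable,'' a fixed-slash format with numerator and denominator at most $N$, and it assumes $x > 0$ lies between representable bounds) carries out this rounding procedure:

```python
from fractions import Fraction

def representable(p, q, N):
    return 0 <= p <= N and 1 <= q <= N    # assumed fixed-slash format

def round_to_representable(x, N):
    # keep the last representable convergent p1/q1 of x, then the best
    # representable semiconvergent on the other side; return the closer
    p0, q0, p1, q1 = 0, 1, 1, 0           # standard convergent seeds
    num, den = x.numerator, x.denominator
    while den:
        a = num // den
        p2, q2 = a * p1 + p0, a * q1 + q0
        if not representable(p2, q2, N):
            break
        p0, q0, p1, q1 = p1, q1, p2, q2
        num, den = den, num - a * den
    if Fraction(p1, q1) == x:
        return x
    b = 0                                 # largest b keeping it representable
    while representable((b + 1) * p1 + p0, (b + 1) * q1 + q0, N):
        b += 1
    cand = Fraction(b * p1 + p0, b * q1 + q0)
    return min(Fraction(p1, q1), cand, key=lambda f: abs(f - x))

pi = Fraction(3141592653589793, 10**15)   # pi to 16 digits
assert round_to_representable(pi, 100) == Fraction(22, 7)
```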

\ansno 42. See M. S. Waterman, {\sl BIT \bf17} (1977), 465--478.
%folio 790 galley 10 (C) Addison-Wesley 1978	*
\ansbegin{4.5.4}

\ansno 1. If $d↓k$ isn't prime, its
prime factors are cast out before $d↓k$ is tried.

\ansno 2. No; the algorithm would fail if $p↓{t-1} = p↓t$, giving
``1'' as a spurious prime factor.

\ansno 3. Let $P$ be the product of the first 168 primes.\xskip ({\sl
Note:} Although $P$ is a large number, it is considerably
faster on many computers to calculate this gcd than to do the
168 divisions, if we just want to test whether or not $n$ is
prime.)

\ansno 4. In the notation of exercise 3.1--11,$$\sum↓{\mu,\,λ}2↑{\lceil\lg
\max(\mu+1,λ)\rceil}P(\mu,λ)={1\over m}\sum↓{l≥1}f(l)\prod↓{1≤k<l}
\bigglp1-{k\over m}\biggrp,$$
where $f(l)=\sum↓{1≤λ≤l}2↑{\lceil\lg\max(l-λ,λ)\rceil}$.
If $l=2↑{k+\theta}$, where $0<\theta≤1$, we have $f(l)=l↑2(3\cdot2↑{-\theta}-2\cdot
2↑{-2\theta})$, where the function $3\cdot2↑{-\theta}-2\cdot2↑{-2\theta}$
reaches a maximum of $9\over8$ at $\theta=\lg(4/3)$ and has a minimum of 1 at
$\theta=0$ and 1. Therefore the average value of $2↑{\lceil\lg\max(\mu+1,λ)\rceil}$
lies between 1.0 and 1.125 times the average value of $\mu+λ$, and the result
follows.

$\biglp$Algorithm B is a refinement of Pollard's original algorithm, which was based
on exercise 3.1--6(b) instead of the (yet undiscovered) result in exercise 3.1--7.
He showed that the least $n$ such that $X↓{2n}=X↓n$ has average value $~(π↑2/12)
Q(m)$; this constant $π↑2/12$ is explained by Eq.\ 4.5.3--21. Hence the average
value of $3n$ in his original method is $~(π/2)↑{5/2}\sqrt m=3.092\sqrt m$.\xskip
Richard Brent observes that, as $m→∞$, the density $\prod↓{1≤k<l}\biglp1-k/m
\bigrp=\exp\biglp-l(l-1)/2m+O(l↑3/m↑2)\bigrp$ approaches a normal distribution,
and we may assume that $\theta$ is uniformly distributed. Then $3\cdot2↑{-\theta}
-2\cdot2↑{-2\theta}$ takes the average value $3/(4\ln2)$, and the average
number of iterations needed by Algorithm B comes to $~\biglp3/(4\ln2)+{1\over2}
\bigrp\sqrt{πm/2}=1.983\sqrt m$. A similar analysis of the more general method
in the answer to exercise 3.1--7 gives $~1.926\sqrt m$, when $p=2.4771$ is
chosen ``optimally'' as the root of $(p↑2-1)\ln p=p↑2-p+1$.$\bigrp$
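For comparison, here is a minimal Python sketch of Pollard's original method in the form analyzed above, stopping when $X↓{2n} = X↓n$ modulo an unknown factor; the retry loop over the increment $c$ is an ad hoc addition for the (rare) case that the gcd degenerates to $n$:

```python
from math import gcd

def rho(n, c):
    # Pollard's original rho: detect X_{2i} = X_i modulo a hidden factor
    x = y = 2
    while True:
        x = (x * x + c) % n
        y = (y * y + c) % n
        y = (y * y + c) % n          # y runs twice as fast as x
        g = gcd(abs(x - y), n)
        if g > 1:
            return g                 # may be n itself; caller retries

def rho_factor(n):
    for c in range(1, 20):           # retry with a new increment on failure
        g = rho(n, c)
        if g < n:
            return g
    return None

f = rho_factor(10541)                # 10541 = 83 * 127, cf. answer 5
assert f in (83, 127)
```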

\ansno 5. $x\mod 3 = 0$; $x \mod 5 = 0$,
1, 4; $x \mod 7 = 0$, 1, 6; $x \mod 8 = 1$, 3, 5, 7; $x ≥ 103$.
The first try is $x = 105$; and $(105)↑2 - 10541 = 484 = 22↑2$. This
would also have been found by Algorithm C in a relatively short
time. Thus $10541 = 83 \cdot 127$.

\ansno 6. Let us count the number of solutions
$(x, y)$ of the congruence $N ≡ (x - y)(x + y)\modulo p$,
where $0 ≤ x, y < p$. Since $N \neqv 0$ and $p$ is prime,
$x + y \neqv 0$. For each $v \neqv 0$ there is a unique
$u$ (modulo $p$) such that $N ≡ uv$. The congruences $x - y
≡ u$, $x + y ≡ v$ now uniquely determine $x\mod p$ and $y\mod
p$, since $p$ is odd. Thus the stated congruence has exactly
$p - 1$ solutions $(x, y)$. If $(x, y)$ is a solution, so is
$(x, p - y)$ if $y ≠ 0$, since $(p - y)↑2 ≡ y↑2$; and if $(x,
y↓1)$ and $(x, y↓2)$ are solutions with $y↓1 ≠ y↓2$, we have
$y↑{2}↓{1} ≡ y↑{2}↓{2}$; hence $y↓1 = p - y↓2$. Thus the number
of different $x$ values among the solutions $(x, y)$ is $(p
- 1)/2$ if $N ≡ x↑2$ has no solutions, or $(p + 1)/2$ if $N
≡ x↑2$ has solutions.

\ansno 7. One procedure is to keep two indices for each modulus,
one for the current word position and one for the current bit
position; loading two words of the table and doing an indexed
shift command will bring the table entries into proper alignment.\xskip
(Many computers have special facilities for such bit manipulation.)

\ansno 8. (We may assume that $N = 2M$
is even.)\xskip The following algorithm uses an auxiliary table $X[1]$,
$X[2]$, $\ldotss$, $X[M]$, where $X[k]$ represents the primality
of $2k + 1$.

\def\\#1. {\yskip\noindent\hangindent 38pt\hbox to 38pt{\hfill\bf#1. }}
\\S1. Set $X[k] ← 1$
for $1 ≤ k ≤ M$. Also set $j ← 1$, $p ← 3$, $q ← 4$.\xskip (During
this algorithm $p = 2j + 1$, $q = 2j + 2j↑2$; the integer variables
$j$, $k$, $p$, $q$ may readily be manipulated in index registers.)

\\S2. If $X[j] = 0$, go to S4.
Otherwise output $p$, which is prime, and set $k ← q$.

\\S3. If $k ≤ M$, then set $X[k] ← 0$, $k ← k + p$, and repeat this step.

\\S4. Set $j ← j + 1$, $p ← p + 2$, $q
← q + 2p - 2$. If $j ≤ M$, return to S2.\quad\blackslug

\yskip\noindent A major part of this calculation
could be made noticeably faster if $q$ (instead of $j$) were
tested against $M$ in step S4, and if a new loop were appended
that outputs $2j + 1$ for all remaining $X[j]$ that equal
1, suppressing the manipulation of $p$ and $q$.

Further discussion of sieve methods for generating primes appears in
exercise 5.2.3--15 and in Section 7.1.
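A direct Python transcription of steps S1--S4 (an added illustration) shows the invariants in action:

```python
def odd_sieve(N):
    # X[k] represents the primality of 2k+1, for 1 <= k <= M
    M = N // 2
    X = [True] * (M + 1)
    primes = [2]
    j, p, q = 1, 3, 4                 # p = 2j+1, q = 2j+2j^2, so 2q+1 = p^2
    while j <= M:
        if X[j]:
            primes.append(p)
            k = q
            while k <= M:             # strike p^2, p^2+2p, p^2+4p, ...
                X[k] = False
                k += p
        j, p, q = j + 1, p + 2, q + 2 * p + 2
    return primes

def is_prime(n):                      # naive reference, for checking only
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

assert odd_sieve(1000) == [2] + [n for n in range(3, 1002, 2) if is_prime(n)]
```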

\ansno 9. If $p↑2$ is a divisor of $n$ for some
prime $p$, then $p$ is a divisor of $λ(n)$, but not of $n -
1$. If $n = p↓1p↓2$, where $p↓1 < p↓2$ are primes, then $p↓2
- 1$ is a divisor of $λ(n)$ and therefore $p↓1p↓2 - 1 ≡ 0
\modulo{p↓2 - 1}$. Since $p↓2 ≡ 1$, this means $p↓1 - 1$ is a multiple
of $p↓2 - 1$, contradicting the assumption $p↓1 < p↓2$.\xskip $\biglp$Values
of $n$ for which $λ(n)$ properly divides $n - 1$ are called
``Carmichael numbers.'' For example, here are some small Carmichael numbers with
up to six prime factors: $3\cdot11\cdot17$, $5\cdot13\cdot17$, $7\cdot11\cdot13
\cdot41$, $5\cdot7\cdot17\cdot19\cdot73$, $5\cdot7\cdot17\cdot73\cdot89\cdot107
$.$\bigrp$
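A quick Python check (added here for illustration) verifies that $λ(n)$ divides $n - 1$ for each of the listed Carmichael numbers:

```python
from math import gcd, prod

def lam(ps):
    # lambda(n) for n a product of distinct odd primes: lcm of the p - 1
    l = 1
    for p in ps:
        l = l * (p - 1) // gcd(l, p - 1)
    return l

examples = [(3, 11, 17), (5, 13, 17), (7, 11, 13, 41),
            (5, 7, 17, 19, 73), (5, 7, 17, 73, 89, 107)]
for ps in examples:
    n = prod(ps)
    assert (n - 1) % lam(ps) == 0 and lam(ps) < n - 1
```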

\ansno 10. Let $k↓p$ be the order of $x↓p$ modulo $n$, and let
$λ$ be the least common multiple of all the $k↓p$'s. Then $λ$
is a divisor of $n - 1$ but not of any $(n - 1)/p$, so $λ =
n - 1$. Since $x↑{\varphi (n)}↓{p} \mod n = 1$, $\varphi (n)$
is a multiple of $k↓p$ for all $p$, so $\varphi (n) ≥ λ$. But
$\varphi (n) < n - 1$ when $n$ is not prime.\xskip (Another way to
carry out the proof is to construct an element $x$ of order
$n - 1$ from the $x↓p$'s, by the method of exercise 3.2.1.2--15.)

\anskip\null\penalty1000\vskip-6pt
\vbox{\halign to size{\hbox to 19pt{\hfill\bf#}\tabskip0pt plus100pt⊗
\rt{#}\tabskip10pt⊗\rt{#}⊗\rt{#}⊗\rt{#}⊗\rt{#}⊗\rt{#}⊗$\rt{#}$\tabskip 0pt plus100pt
\cr
11. ⊗$U$\hfill⊗$V$\hfill⊗$A$\hfill⊗$P$\hfill⊗$S$⊗$T$⊗\hbox{Output}\cr
\noalign{\vskip2pt}
⊗1984⊗1⊗0⊗992⊗0⊗---\cr
⊗1981⊗1981⊗1⊗992⊗1⊗1981\cr
⊗1983⊗4⊗495⊗993⊗0⊗1⊗993↑2 ≡+2↑2 \cr
⊗1983⊗991⊗2⊗98109⊗1⊗991\cr
⊗1981⊗4⊗495⊗2⊗0⊗1⊗2↑2 ≡+2↑2 \cr
⊗1984⊗1981⊗1⊗99099⊗1⊗1981\cr
⊗1984⊗1⊗1984⊗99101⊗0⊗1⊗99101↑2 ≡+2↑0 \cr}}
\yskip\noindent The factorization $199 \cdot 991$ is evident from
the first or last outputs. The shortness of the cycle, and the
appearance of the well-known number 1984, are probably just
coincidences.

\ansno 12. The following algorithm makes use of an auxiliary
$(m + 1) \times (m + 1)$ matrix of single-precision integers
$E↓{jk}$, $0 ≤ j, k ≤ m$; a single-precision vector $(b↓0, b↓1,
\ldotss , b↓m)$; and a multiple-precision vector $(x↓0, x↓1,
\ldotss , x↓m)$ with entries in the range $0 ≤ x↓k < N$.

\algstep F1. [Initialize.] Set
$b↓i ← -1$ for $0 ≤ i ≤ m$; then set $j ← 0$.

\algstep F2. [Next solution.] Get the next output
$(x, e↓0, e↓1, \ldotss , e↓m)$ produced by Algorithm E\null.\xskip (It is
convenient to regard Algorithms E and F as coroutines.)\xskip Set
$k ← 0$.

\algstep F3. [Search for odd.] If $k > m$ go to step
F5. Otherwise if $e↓k$ is even, set $k ← k + 1$ and repeat this
step.

\algstep F4. [Linear dependence?] If $b↓k ≥ 0$, then
set $i ← b↓k$, $x ← (x↓ix)\mod N$, $e↓r ← e↓r + E↓{ir}$ for $0
≤ r ≤ m$; set $k ← k + 1$ and return to F3. Otherwise set $b↓k
← j$, $x↓j ← x$, $E↓{jr} ← e↓r$ for $0 ≤ r ≤ m$; set $j ← j + 1$
and return to F2.\xskip (In the latter case we have a new linearly
independent solution, modulo 2, whose first odd component is
$e↓k$.)

\algstep F5. [Try to factor.] (Now $e↓0$, $e↓1$, $\ldotss$,
$e↓m$ are even.)\xskip Set
$$y ← \biglp (-1) ↑{e↓0/2}p↑{e↓1/2}↓{1} \ldotss p↑{e↓m/2}↓{m}\bigrp
\mod N.$$
If $x = y$ or if $x + y = N$, return
to F2. Otherwise compute $\gcd(x - y, N)$, which is a proper
factor of $N$, and terminate the algorithm.\quad\blackslug

\yyskip\noindent It can be shown that this
algorithm finds a factor, whenever one is deducible from the
given outputs of Algorithm E.
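The combination step can be illustrated on a tiny example (the modulus and relations below were chosen for this sketch and are not from the original answer): from $41↑2 ≡ 2↑5$ and $43↑2 ≡ 2↑3\cdot 5↑2$ modulo $1649$ one gets $114↑2 ≡ 80↑2$, and $\gcd(114 - 80, 1649) = 17$.

```python
from math import gcd

def algorithm_f(n, relations, primes):
    # combine congruences x^2 = p1^e1 ... pm^em (mod n) until all
    # exponents are even, then try gcd(x - y, n); `relations` is a
    # list of (x, exponent-vector) pairs
    m = len(primes)
    pivot = {}                        # first-odd-component -> (x, vector)
    for x, e in relations:
        e = list(e)
        k = 0
        while k < m:
            if e[k] % 2:
                if k not in pivot:
                    pivot[k] = (x, e)
                    break             # linearly independent; need more data
                px, pe = pivot[k]
                x = x * px % n
                e = [a + b for a, b in zip(e, pe)]
            k += 1
        else:                         # all exponents even: try to factor
            y = 1
            for p, exp in zip(primes, e):
                y = y * pow(p, exp // 2, n) % n
            if x != y and (x + y) % n:
                return gcd(abs(x - y), n)
    return None

n = 1649                              # = 17 * 97
f = algorithm_f(n, [(41, (5, 0)), (43, (3, 2))], (2, 5))
assert f in (17, 97) and n % f == 0
```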

\ansno 13. $f(p, p↑2d) = 2/(p + 1) + f(p, d)/p$, since $1/(p
+ 1)$ is the probability that $A$ is a multiple of $p$.\xskip $f(p,
pd) = 1/(p + 1)$ when $d\mod p ≠ 0$.\xskip $f(2, 4k + 3) = {1\over
3}$ since $A↑2 - (4k + 3)B↑2$ cannot be a multiple of 4; $f(2,
8k + 5) = {2\over 3}$ since $A↑2 - (8k + 5)B↑2$ cannot be a
multiple of 8; $f(2, 8k + 1) = {1\over 3} + {1\over 3} + {1\over
3} + {1\over 6} + {1\over 12} +\cdots = {4\over 3}$.\xskip
$f(p, d) = \biglp2p/(p↑2 - 1), 0\bigrp$ if $d↑{(p-1)/2}\mod p =
(1, p - 1)$, respectively, for odd $p$.

\ansno 14. Since $P↑2 ≡ kNQ↑2\modulo p$ for any prime divisor
$p$ of $V$, we have $1 ≡ P↑{2(p-1)/2} ≡ (kNQ↑2)↑{(p-1)/2} ≡
(kN)↑{(p-1)/2}\modulo p$, if $P \neqv 0$.
%folio 794 galley 11a (C) Addison-Wesley 1978	*
\ansno 15. $U↓n = (a↑n - b↑n)/\sqrt{D}$, where $a = {1\over
2}(P + \sqrt{D})$, $b = {1\over 2}(P - \sqrt{D})$, $D = P↑2 - 4Q$.
Then $2↑{n-1}U↓n = \sum ↓k{n\choose 2k+1}P↑{n-2k-1}D↑k$; so
$U↓p ≡ D↑{(p-1)/2}\modulo p$ if $p$ is an odd prime. Similarly,
if $V↓n = a↑n + b↑n = U↓{n+1} - QU↓{n-1}$, then $2↑{n-1}V↓n
= \sum ↓k {n\choose 2k}P↑{n-2k}D↑k$, and $V↓p ≡ P↑p ≡ P$. Thus
if $U↓p ≡ -1$, we find that $U↓{p+1}\mod p = 0$. If $U↓p ≡
1$, we find that $(QU↓{p-1})\mod p = 0$; here if $Q$ is a multiple
of $p$, $U↓n ≡ P↑{n-1}\modulo p$ for $n > 0$, so $U↓n$ is
never a multiple of $p$; if $Q$ is not a multiple of $p$, $U↓{p-1}
\mod p = 0$. Therefore as in Theorem L\null, $U↓t\mod N = 0$ if
$N = p↑{e↓1}↓{1} \ldotss p↑{e↓r}↓{r}$, $\gcd(N, Q) = 1$, and $t
=\lcm↓{1≤j≤r}\biglp p↑{e↓j-1}↓{j}(p↓j + ε↓j)\bigrp $. Under
the assumptions of this exercise, the rank of apparition of
$N$ is $N + 1$; hence $N$ is prime to $Q$ and $t$ is a multiple
of $N + 1$. Also, the assumptions of this exercise imply that
each $p↓j$ is odd and each $ε↓j$ is $\pm 1$, so $t ≤ 2↑{1-r}\prod p↑{e↓j-1}↓{j}(p↓j
+ {1\over 3}p↓j) = 2({2\over 3})↑rN$; hence $r = 1$ and $t =
p↑{e↓1}↓{1} + ε↓1p↑{e↓1-1}↓{1}$. Finally, therefore, $e↓1 = 1$
and $ε↓1 = 1$.

{\sl Note:} If this test for primality
is to be any good, we must choose $P$ and $Q$ in such a way that
the test will probably work. Lehmer suggests taking
$P = 1$ so that $D = 1 - 4Q$, and choosing $Q$ so that $\gcd(N,
QD) = 1$.\xskip (If the latter condition fails, we know already that
$N$ is not prime, unless $|QD| ≥ N$.)\xskip Furthermore, the derivation
above shows that we will want $ε↓1 = 1$, that is, $D↑{(N-1)/2}
≡ -1\modulo N$. This is another condition that determines
the choice of $Q$. Furthermore, if $D$ satisfies this condition,
and if $U↓{N+1}\mod N ≠ 0$, we know that $N$ is {\sl not}
prime.

{\sl Example:} If $P = 1$ and $Q = -1$, we have
the Fibonacci sequence, with $D = 5$. Since $5↑{11} ≡ -1 \modulo
{23}$, we might attempt to prove that 23 is prime by using the
Fibonacci sequence:
$$\langle F↓n\mod 23\rangle = 0, 1, 1, 2, 3, 5, 8, 13, 21,
11, 9, 20, 6, 3, 9, 12, 21, 10, 8, 18, 3, 21, 1, 22, 0,\ldotss,$$
so 24 is the rank of apparition of 23
and the test works. However, the Fibonacci sequence cannot be
used in this way to prove the primality of 13 or 17, since $F↓7
\mod 13 = 0$ and $F↓9\mod 17 = 0$. When $p ≡ \pm 1\modulo{10}$,
we have $5↑{(p-1)/2}\mod p = 1$, so $F↓{p-1}$ (not $F↓{p+1}$)
is divisible by $p$.
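These three facts are easy to verify by computing Fibonacci numbers modulo $m$ (a Python illustration added here):

```python
def fib_mod(limit, m):
    # F_0, F_1, ..., F_limit reduced modulo m
    f = [0, 1]
    for _ in range(limit - 1):
        f.append((f[-1] + f[-2]) % m)
    return f

f = fib_mod(24, 23)
assert f[24] == 0 and 0 not in f[1:24]   # rank of apparition of 23 is 24
assert fib_mod(7, 13)[7] == 0            # F_7 = 13
assert fib_mod(9, 17)[9] == 0            # F_9 = 34
```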

\ansno 17. Let $f(q) = 2\lg q - 1$. When $q = 2$ or 3, the
tree has at most $f(q)$ nodes. When $q > 3$ is prime, let $q
= 1 + q↓1 \ldotsm q↓t$ where $t ≥ 2$ and $q↓1$, $\ldotss$, $q↓t$
are prime. The size of the tree is $≤1 + \sum f(q↓k) = 2 + f(q
- 1) - t < f(q)$.\xskip [{\sl SIAM J. Computing \bf 4} (1975), \hbox{214--220}.]

\ansno 18. $x\biglp G(α) - F(α)\bigrp$ is the number of $n ≤
x$ whose second-largest prime factor is $≤x↑α$ and whose largest
prime factor is $>x↑α$. Hence$$xG↑\prime (t)\,dt = \biglp π(x↑{t+dt})
- π(x↑t)\bigrp\cdot x↑{1-t}\biglp G\biglp t/(1 - t)\bigrp - F\biglp t/(1
- t)\bigrp\bigrp.$$ The probability that $p↓{t-1} ≤ \sqrt{\chop to 0pt{p↓t}}$ is
$\int ↑{1}↓{0} F\biglp t/2(1 - t)\bigrp t↑{-1}\,dt$.\xskip
[Curiously, it can be shown that this
also equals $\int ↑{1}↓{0} F\biglp t/(1 - t)\bigrp\,dt$, the
average value of $\log p↓t/\!\log x$, and it also equals Golomb's constant $λ$
in exercise 1.3.3--23. The derivative $G↑\prime
(0)$ can be shown to equal $\int ↑{1}↓{0} F\biglp t/(1 - t)\bigrp
t↑{-2}\,dt = F(1) + 2F({1\over 2}) + 3F({1\over 3}) +\cdots
= e↑\gamma$. The third-largest prime factor has $H(α
) = \int ↑{α}↓{0} \biglp H\biglp t/(1 - t)\bigrp - G\biglp t/(1 - t)\bigrp\bigrp
t↑{-1}\,dt$ and $H↑\prime(0) = ∞$. See D. E. Knuth and L. Trabb Pardo,
{\sl Theoretical Comp.\ Sci.\ \bf3} (1976), 321--348.]

\ansno 19. $M = 2↑D - 1$ is a multiple of all $p$ for which
the order of 2 modulo $p$ divides $D$. To extend this idea, let $a↓1 = 2$
and $a↓{j+1}
= a↑{q↓j}↓{j}\mod N$, where $q↓j = p↑{e↓j}↓{j}$, $p↓j$ is the
$j$th prime, and $e↓j = \lfloor \log 1000/\!\log p↓j\rfloor $;
let $A = a↓{169}$. Now compute $b↓q =\gcd(A↑q - 1, N)$ for
all primes $q$ between $10↑3$ and $10↑5$. One way to do this is
to start with $A↑{1009}\mod N$ and then to multiply alternately
by $A↑4\mod N$ and $A↑2\mod N$.\xskip (A similar method was used
in the 1920s by D. N. Lehmer, but he didn't publish it.)\xskip As
with Algorithm B we can avoid most of the gcd's by batching;
e.g., since $b↓{30r-k} =\gcd(A↑{30r} - A↑k, N)$, we might try
batches of\penalty1000\ 8, computing $c↓r = (A↑{30r} - A↑{29})(A↑{30r} -
A↑{23}) \ldotsm (A↑{30r} - A)\mod N$, then $\gcd(c↓r, N)$ for
$33 < r ≤ 3334$.

\ansno 22. Algorithm P fails only when the random
number $x$ does not reveal the fact that $n$ is nonprime. 
Say $x$ is {\sl bad\/} if $x↑q\mod n=1$ or
if one of the numbers $x↑{2↑jq}$ is $≡-1\modulo n$ for $0≤j<k$. Since 1 is bad,
we have $p↓n=(b↓n-1)/(n-2)<b↓n/(n-1)$, where $b↓n$ is the number of bad $x$
such that $1≤x<n$, when $n$ is not prime.

Every bad $x$ satisfies $x↑{n-1}≡1\modulo n$. When $p$ is prime, the number of
solutions to the congruence $x↑q≡1\modulo{p↑e}$ for $1≤x≤p↑e$ is the number of
solutions of $qy≡0$ $\biglp$\hbox{modulo}\penalty1000\ $p↑{e-1}(p-1)\bigrp$
for $0≤y<p↑{e-1}(p-1)$, namely $\gcd\biglp q,p↑{e-1}(p-1)\bigrp$, since we
may replace $x$ by $a↑y$ where $a$ is a primitive root.

Let $n=n↓1↑{e↓1}\ldotss n↓r↑{e↓r}$, where the $n↓i$ are distinct primes. According
to the
Chinese remainder theorem, the number of solutions to the congruence
$x↑{n-1}≡1\modulo n$ is
$\prod↓{1≤i≤r}\gcd\biglp n-1,n↓i↑{e↓i-1}(n↓i-1)\bigrp$, and this is at most
$\prod↓{1≤i≤r}(n↓i-1)$ since $n↓i$ is relatively prime to $n-1$. If
some $e↓i>1$, we have $n↓i-1≤{2\over9}n↓i↑{e↓i}$, hence the number of solutions
is at most ${2\over9}n$; in this case $b↓n≤{2\over9}n≤{1\over4}(n-1)$, since
$n≥9$.

Therefore we may assume that $n$ is the product $n↓1\ldotsm n↓r$ of distinct
primes. Let $n↓i=1+2↑{k↓i}q↓i$, where $k↓1≤\cdots≤k↓r$. Then $\gcd(n-1,n↓i-1)
=2↑{k↑\prime↓i}q↑\prime↓i$, where $k↑\prime↓i=\min(k,k↓i)$ and $q↑\prime↓i=
\gcd(q,q↓i)$. Modulo $n↓i$, the number of $x$ such that $x↑q≡1$ is $q↑\prime↓i$;
and the number of $x$ such that $x↑{2↑jq}≡-1$ is $2↑jq↓i↑\prime$ for
$0≤j<k↓i↑\prime$, otherwise 0. Since $k≥k↓1$, we have $b↓n=q↓1↑\prime\ldotsm
q↓r↑\prime\,\biglp1+\sum↓{0≤j<k↓1}2↑{jr}\bigrp$.

To complete the proof, it suffices to show that $b↓n≤{1\over4}q↓1\ldotsm q↓r
2↑{k↓1+\cdots+k↓r}={1\over4}\varphi(n)$, since $\varphi(n)<n-1$. We have
$\biglp1+\sum↓{0≤j<k↓1}2↑{jr}\bigrp\hbox{\:a/}2↑{k↓1+\cdots+k↓r}≤
\biglp1+\sum↓{0≤j<k↓1}2↑{jr}\bigrp\hbox{\:a/}2↑{k↓1r}=1/(2↑r-1)+
(2↑r-2)\hbox{\:a/}\biglp2↑{k↓1r}(2↑r-1)\bigrp≤1/2↑{r-1}$, so the result follows
unless $r=2$ and $k↓1=k↓2$. If $r=2$, exercise 9 shows that $n-1$ is not a
multiple of both $n↓1-1$ and $n↓2-1$. Thus if $k↓1=k↓2$ we cannot have both
$q↓1↑\prime=q↓1$ and $q↓2↑\prime=q↓2$; it follows that $q↓1↑\prime q↓2↑\prime
≤{1\over3}q↓1q↓2$ and $b↓n≤{1\over6}\varphi(n)$ in this case.

\yyskip [{\sl Reference: J. Number Theory} ({\sl c}. 1979), to appear.
The above proof shows that $p↓n$ is near $1\over 4$ only in two cases, when $n=
(1+2q↓1)(1+4q↓1)$ or $(1+2q↓1)(1+2q↓2)(1+2q↓3)$. For example, when $n=
49939\cdot99877$ we have $b↓n={1\over4}(49938\cdot99876)$ and $p↓n\approx
.2499925$. See the next answer for further remarks.]
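The bound on the number of bad $x$ is easy to confirm by brute force for small $n$. A Python sketch (the routine and its name are ours, not part of the text): with n - 1 = 2^k q, q odd, it counts the x with x^q = 1 or x^(2^j q) = -1 (mod n) for some 0 <= j < k.

```python
def bad_count(n):
    # Count the "bad" x with 1 <= x < n: those that fail to reveal that
    # n is composite in Algorithm P.  Write n - 1 = 2^k * q with q odd;
    # x is bad when x^q = 1 (mod n) or x^(2^j q) = -1 (mod n), 0 <= j < k.
    q, k = n - 1, 0
    while q % 2 == 0:
        q //= 2
        k += 1
    count = 0
    for x in range(1, n):
        y = pow(x, q, n)
        if y == 1 or y == n - 1:
            count += 1
            continue
        for _ in range(k - 1):
            y = y * y % n
            if y == n - 1:
                count += 1
                break
    return count
```

For prime $n$ the count is $n-1$, since every $x$ passes; for $n=9$ only $x=1$ and $x=8$ are bad, and the bound $4b↓n≤n-1$ holds with equality.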

\ansno 23. (a)\9 The proofs are simple except perhaps for the reciprocity law.
Let $p=p↓1\ldotsm p↓s$ and $q=q↓1\ldotsm q↓r$, where the $p↓i$ and $q↓j$ are
prime. Then
$$\bigglp{p\over q}\biggrp=\prod↓{i,j}\bigglp{p↓i\over q↓j}\biggrp
=\prod↓{i,j}(-1)↑{(p↓i-1)(q↓j-1)/4}\,\bigglp{q↓j\over p↓i}\biggrp=
(-1)↑{\vcenter{\hbox{\:b\char6}}↓{i,j}
(p↓i-1)(q↓j-1)/4}\,\bigglp{q\over p}\biggrp,$$
so we need only verify that $\sum↓{i,j\,}(p↓i-1)(q↓j-1)/4≡(p-1)(q-1)/4\modulo 2$.
But $\sum↓{i,j\,}(p↓i-1)(q↓j-1)/4=\biglp\sum↓i(p↓i-1)/2\bigrp\biglp\sum↓j
(q↓j-1)/2\bigrp$ is odd iff an odd number of the $p↓i$ and an odd number of the
$q↓j$ are $≡3\modulo 4$, and this holds iff $(p-1)(q-1)/4$ is odd.

\def\\#1{\raise 2pt\hbox{$\scriptstyle#1$}}
(b) As in exercise 22, we may assume that $n=n↓1\ldotsm n↓r$ where the $n↓i=1+
2↑{k↓i}q↓i$ are distinct primes, and $k↓1≤\cdots≤k↓r$;
we let $\gcd(n-1,n↓i-1)=2↑{k↑\prime↓i}q↑\prime↓i$ and we call
$x$ {\sl bad} if it falsely makes $n$ look prime. Let $\Pi↓n=\prod↓{1≤i≤r}
q↓i↑\prime\,2↑{\min(k↓i,k-1)}$ be the number of solutions of $x↑{(n-1)/2}≡1$.
The number of bad $x$ with $({\\x\over n})=1$ is $\Pi↓n$, times an extra factor
of $1\over2$ if $k↓1<k$.\xskip (This factor $1\over2$ is needed to
ensure that $({\\x\over n↓i})=-1$ 
for an even number of the $n↓i$ with $k↓i<k$.)\xskip
The number of bad $x$ with $({\\x\over n})=-1$ is $\Pi↓n$ if $k↓1
=k$, otherwise 0.\xskip$\biglp$If $x↑{(n-1)/2}≡-1\modulo{n↓i}$, we have $({\\x\over
n↓i})=-1$ if $k↓i=k$, $({\\x\over n↓i})=+1$ if $k↓i>k$, and a contradiction if
$k↓i<k$. If $k↓1=k$, there are an odd number of $k↓i$ equal to
$k$.$\bigrp$

{\sl Notes:} The probability of a 
bad guess is $>{1\over4}$ only if $n$ is a Carmichael
number with $k↓r<k$; for example, $n=7\cdot13\cdot19=1729$,
a number made famous by Ramanujan in another context. It is interesting to
compare the procedure of this exercise with Algorithm P\null; Louis Monier has shown
that every $x$ that is bad for Algorithm P is also bad for the Solovay-Strassen
test, therefore Algorithm P is always better.
He has also extended the above analyses to obtain the following closed
formulas for the number of bad $x$ in general:
$$\eqalign{b↓n⊗=\bigglp1+{2↑{rk↓1}-1\over2↑r-1}\biggrp\prod↓{1≤i≤r}q↓i↑\prime;\cr
b↓n↑\prime⊗=\delta↓n\prod↓{1≤i≤r}\gcd\left({n-1\over2},\;n↓i-1\right).\cr}$$
Here $b↓n↑\prime$ is the number of bad $x$ in this exercise, and $\delta↓n$ is
either 2 (if $k↓1=k$), or $1\over2$ (if $k↓i<k$ and $e↓i$ is odd for some $i$), or
1 (otherwise).
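The two tests can be compared empirically. A Python sketch (all names ours) of the Jacobi symbol, the "bad $x$" condition of this exercise (the Solovay-Strassen test), and the "bad $x$" condition of Algorithm P from exercise 22:

```python
def jacobi(a, n):
    # Jacobi symbol (a/n) for odd n > 0, computed by quadratic reciprocity
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def strong_liar(x, n):
    # x is bad for Algorithm P applied to n (n odd)
    q, k = n - 1, 0
    while q % 2 == 0:
        q //= 2
        k += 1
    y = pow(x, q, n)
    if y == 1 or y == n - 1:
        return True
    for _ in range(k - 1):
        y = y * y % n
        if y == n - 1:
            return True
    return False

def euler_liar(x, n):
    # x is bad for the Solovay-Strassen test: x^((n-1)/2) = (x/n) (mod n)
    j = jacobi(x, n)
    return j != 0 and pow(x, (n - 1) // 2, n) == j % n
```

Brute force over small composite $n$ confirms the inclusion credited to Monier: every $x$ bad for Algorithm P is bad for this test too, while (for example when $n=561$) the converse fails.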

\ansno 24. Let $M↓1$ be a matrix having one row for each nonprime odd number in the
range $1≤n≤N$
and having $N-1$ columns numbered from 2 to $N$; 
the entry in row $n$ column $x$ is 1 if $n$ passes the $x$ test of Algorithm
P\null, otherwise it is zero. When $N=qn+r$ and $0≤r<n$, we 
know that row $n$ contains at most ${1\over4}qn+\min\biglp{1\over4}n,r\bigrp=
{1\over4}N+\min\biglp{1\over4}n-{1\over4}r,{3\over4}r\bigrp≤{1\over4}N+{3\over16}n
<{1\over2}N$ entries equal to 0, so at least
half of the entries in the matrix are 1. Thus, some column $x↓1$ of $M↓1$ has
at least half of its entries equal to 1. Removing column $x↓1$ and all rows in
which this column contains 1 leaves a matrix $M↓2$ having similar properties; a
repetition of this construction produces matrix $M↓r$ with $N-r$ columns and
fewer than $N↓{\null}/2↑r$ rows, and with
at least ${1\over2}(N-1)$ entries per row equal to 1.\xskip
[Cf.\ {\sl Proc.\ IEEE Symp.\
Foundations of Comp.\ Sci.\ \bf19} (1978), 78.]

$\biglp$A similar proof implies the existence of a {\sl single} infinite
sequence $x↓1<x↓2<\cdots$ such that the number $n>1$ is prime if and only if it
passes the $x$ test of Algorithm P for $x=x↓1$, $\ldotss$, $x=x↓m$, where $m=
{1\over2}\lfloor\lg n\rfloor\biglp\lfloor\lg n\rfloor-1\bigrp$. Is there a
sequence $x↓1<x↓2<\cdots$ having this property but with $m=O(\log n)$?$\bigrp$

\ansno 25. Note that $x\mod y=x-y\,\lfloor x/y\rfloor$ can be computed easily on
such a machine, and we can get simple constants like $0=x-x$, $1=\lfloor
x/x\rfloor$, $2=1+1$; we can test $x>0$ by testing whether $x=1$ or $\lfloor
x/(x-1)\rfloor≠0$.

(a)\9 First compute $l=\lfloor\lg n\rfloor$ in $O(\log n)$ steps, by repeatedly
dividing by 2; at the same time compute $k=2↑l$ and $A←2↑{2↑{l+1}}$ in
$O(\log n)$ steps by repeatedly setting $k←2k$, $A←A↑2$. For the main
computation, suppose we know that $t=A↑m$, $u=(A+1)↑m$, and $v=m!$; then
we can increase the value of $m$ by 1 by setting $m←m+1$, $t←At$, $u←(A+1)u$,
$v←vm$; and we can {\sl double} the value of $m$ by setting $m←2m$, $u←u↑2$,
$v←\biglp\lfloor u/t\rfloor\mod A\bigrp v↑2$, $t←t↑2$, provided that $A$ is
sufficiently large.\xskip$\biglp$Consider the number $u$ in radix-$A$ notation;
$A$ must be greater than $2m\choose m$.$\bigrp$\xskip Now if $n=(a↓l\ldotsm a↓0)
↓2$, let $n↓j=(a↓l\ldotsm a↓j)↓2$; if $m=n↓j$ and $k=2↑j$ and $j>0$ we can
decrease $j$ by 1 by setting $k←\lfloor k/2\rfloor$, $m←2m+\biglp\lfloor n/k
\rfloor\mod2\bigrp$. Hence we can compute $n↓j!$ for $j=l$, $l-1$, $\ldotss$,
0 in $O(\log n)$ steps.\xskip[Another solution, due to Julia Robinson, is to
compute $n!=\lfloor B↑n/{B\choose n}\rfloor$ when $B>(2n)↑{n+1}$; cf.\
{\sl AMM \bf80} (1973), 250--251, 266.]
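The doubling step can be simulated directly: after $u←u↑2$ we have $u=(A+1)↑{2m}$, whose radix-$A$ digit in position $m$ is $2m\choose m$, provided $A$ exceeds that digit. A Python sketch of part (a) (names ours; A = 2^(2n+2) is comfortably large, since C(2m,m) < 4^m <= 4^n):

```python
def factorial_by_doubling(n):
    # Compute n! using only the two moves of the answer: m -> m+1 and
    # m -> 2m, maintaining the invariant (t, u, v) = (A^m, (A+1)^m, m!).
    A = 1 << (2 * n + 2)            # A > C(2m, m) for every m <= n
    m, t, u, v = 0, 1, 1, 1
    for bit in bin(n)[2:]:          # build n bit by bit, high to low
        # double: m -> 2m
        m, u = 2 * m, u * u
        v = ((u // t) % A) * v * v  # (u // t) % A == C(2m, m), so v = (2m)!
        t = t * t
        if bit == '1':              # add one: m -> m+1
            m, t, u, v = m + 1, A * t, (A + 1) * u, v * (m + 1)
    return v
```

Each bit of $n$ costs a bounded number of arithmetic operations, so the whole computation takes $O(\log n)$ steps, as claimed.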

(b)\9 First compute $A=2↑{2↑{l+2}}$ as in (a), then find the least $k≥0$ such that
$2↑{k+1}!\mod n=0$. If $\gcd(n,2↑k!)≠1$, let $f(n)$ be this value; note that this
gcd can be computed in $O(\log n)$ steps by Euclid's algorithm. Otherwise we will
find the least integer $m$ such that ${m\choose\lfloor m/2\rfloor}\mod n=0$, and
let $f(n)=\gcd(m,n)$.\xskip$\biglp$Note that in this case $2↑k<m≤2↑{k+1}$,
hence $\lceil m/2\rceil≤2↑k$ and $\lceil m/2\rceil!$ is relatively prime to
$n$; therefore ${m\choose\lfloor m/2\rfloor}\mod n=0$ iff $m!\mod n=0$.
Furthermore $n≠4$.$\bigrp$

To compute $m$ with a bounded number of registers, we can use Fibonacci numbers
(cf.\ Algorithm 6.2.1F\null). Suppose we know that
$s=F↓j$, $s↑\prime=F↓{j+1}$, $t=A↑{F↓j}$, $t↑\prime=
A↑{F↓{j+1}}$, $u=(A+1)↑{2F↓j}$, $u↑\prime=(A+1)↑{2F↓{j+1}}$, $v=A↑m$,
$w=(A+1)↑{2m}$, ${2m\choose m}\mod n≠0$, and ${2(m+s)\choose m+s}\mod n=0$.
It is easy to reach this state of affairs with $m=F↓{j+1}$, for suitably large
$j$, in $O(\log n)$ steps; furthermore $A$ will be larger than $2↑{2(m+s)}$.
If $s=1$, we set $f(n)=\gcd(2m+1,n)$ or $\gcd(2m+2,n)$, whichever is $≠1$,
and terminate the algorithm. Otherwise we reduce $j$ by 1 as follows: Set
$r←s$, $s←s↑\prime-s$, $s←r$, $r←t$, $t←\lfloor t↑\prime/t\rfloor$, $t↑\prime←r$,
$r←u$, $u←\lfloor u↑\prime/u\rfloor$, $u↑\prime←r$; then
if $\biglp\lfloor wu/vt\rfloor \mod A\bigrp\mod n≠0$, set $m←m+s$, $w←wu$, $v←vt$.

[Can this problem be solved with fewer than $O(\log n)$ operations? Can the
smallest, or the largest, prime factor of $n$ be computed in $O(\log n)$
operations?]
%folio 795 galley 11b (C) Addison-Wesley 1978	*
\ansbegin{4.6}

\ansno 1. $9x↑2 + 7x + 9$;\xskip $5x↑3 + 7x↑2
+ 2x + 6$.

\ansno 2. (a)\9 True.\xskip (b) False if the algebraic
system $S$ contains ``zero divisors,'' nonzero numbers whose
product is zero, as in exercise 1; otherwise true.\xskip (c) True when $m≠n$, but
false in general when $m=n$, since the leading coefficients might cancel.

\ansno 3. Assume that $r ≤ s$. For $0
≤ k ≤ r$ the maximum is $m↓1m↓2(k + 1)$; for $r ≤ k ≤ s$ it
is $m↓1m↓2(r + 1)$; for $s ≤ k ≤ r + s$ it is $m↓1m↓2(r + s
+ 1 - k)$. The least upper bound valid for all $k$ is $m↓1m↓2(r
+ 1)$.\xskip (The solver of this exercise will know how to factor
the polynomial $x↑7 + 2x↑6 + 3x↑5 + 3x↑4 + 3x↑3 + 3x↑2 + 2x
+ 1$.)

\ansno 4. If one of the polynomials has fewer than
$2↑t$ nonzero coefficients, the product can be formed by putting
exactly $t - 1$ zeros between each of the coefficients, then
multiplying in the binary number system, and finally using a
logical \.{AND} operation (present on most binary computers, cf.\
Section 4.5.4) to zero out the extra bits. For example, if $t
= 3$, the multiplication in the text would become $(1001000001)↓2
\times (1000001001)↓2 = (1001001011001001001)↓2$; if we \.{AND}
this result with the constant $(1001001 \ldotsm 1001)↓2$, the desired
answer is obtained. A similar technique can be used to multiply
polynomials with nonnegative coefficients, when it is known that
the coefficients will not be too large.
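A Python sketch of this trick for polynomials over GF(2), stored as bitmasks (names ours; clmul is a naive carry-less product, included only as a cross-check): with $t=3$ the text's example multiplies $x↑3+x↑2+1$ by $x↑3+x+1$.

```python
def spread(poly, t):
    # insert t-1 zero bits between the coefficient bits of poly
    out, i = 0, 0
    while poly:
        if poly & 1:
            out |= 1 << (t * i)
        poly >>= 1
        i += 1
    return out

def gf2_multiply(u, v, t):
    # Multiply the GF(2) polynomials with coefficient bitmasks u and v,
    # assuming one of them has fewer than 2^t nonzero coefficients, so
    # that each t-bit field of the integer product holds its column sum.
    prod = spread(u, t) * spread(v, t)
    mask = 0                        # the constant (...001001001)_2
    for i in range(prod.bit_length() // t + 1):
        mask |= 1 << (t * i)
    masked = prod & mask            # keep each coefficient mod 2
    result, i = 0, 0
    while masked:                   # compress the stride-t bits together
        if masked & 1:
            result |= 1 << i
        masked >>= t
        i += 1
    return result

def clmul(u, v):
    # reference carry-less (GF(2)) product
    r = 0
    while v:
        if v & 1:
            r ^= u
        u <<= 1
        v >>= 1
    return r
```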

\ansno 5. Polynomials of degree $≤2n$ can be represented
as $U↓1(x)x↑n + U↓0(x)$ where deg$(U↓1)$ and deg$(U↓0) ≤ n$;
and $\biglp U↓1(x)x↑n + U↓0(x)\bigrp\biglp V↓1(x)x↑n
+ V↓0(x)\bigrp = U↓1(x)V↓1(x)(x↑{2n} + x↑n) + \biglp U↓1(x)
+ U↓0(x)\bigrp\biglp V↓1(x) + V↓0(x)\bigrp x↑n + U↓0(x)V↓0(x)(x↑n
+ 1)$.\xskip (This equation assumes that arithmetic is being done
modulo 2.)\xskip Thus Eqs.\ 4.3.3--3, 4, 5 hold.

{\sl Notes:} S. A. Cook has shown that Algorithm
4.3.3C can be extended in a similar way, and exercise 4.6.4--57
describes a method requiring even fewer operations for large
$n$. But these ideas are not useful for ``sparse'' polynomials
(having mostly zero coefficients).

%folio 796 galley 11c (C) Addison-Wesley 1978	*
\def\\#1({\mathop{\hbox{#1}}(}\def\+#1\biglp{\mathop{\hbox{#1}}\biglp}
\ansbegin{4.6.1}

\ansno 1. $q(x) = 1 \cdot 2↑3x↑3 + 0
\cdot 2↑2x↑2 - 2 \cdot 2x + 8 = 8x↑3 - 4x + 8$;\xskip$r(x) = 28x↑2
+ 4x + 8$.

\ansno 2. The monic sequence of polynomials produced
during Euclid's algorithm has the coefficients $(1, 5, 6, 6,
1, 6, 3)$, $(1, 2, 5, 2, 2, 4, 5)$, $(1, 5, 6, 2, 3, 4)$, $(1, 3,
4, 6)$, 0. Hence the greatest common divisor is $x↑3 + 3x↑2 +
4x + 6$.\xskip (The greatest common divisor of a polynomial and its
reverse is always symmetric, in the sense that it is a unit
multiple of its own reverse.)
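A Python sketch of Euclid's algorithm over GF(7) (names ours) confirms the stated greatest common divisor; leading-coefficient inverses are obtained from Fermat's theorem, since 7 is prime.

```python
def gf_poly_mod(u, v, p):
    # remainder of u(x) divided by v(x) over GF(p); polynomials are
    # coefficient lists, highest degree first
    u = u[:]
    inv = pow(v[0], p - 2, p)       # inverse of the leading coefficient
    while len(u) >= len(v):
        c = u[0] * inv % p
        for i in range(len(v)):
            u[i] = (u[i] - c * v[i]) % p
        del u[0]                    # leading coefficient is now zero
    while u and u[0] == 0:
        del u[0]
    return u

def gf_poly_gcd(u, v, p):
    while v:
        u, v = v, gf_poly_mod(u, v, p)
    inv = pow(u[0], p - 2, p)       # make the gcd monic
    return [c * inv % p for c in u]
```

As the parenthetical remark predicts, the resulting gcd $x↑3+3x↑2+4x+6$ is a unit multiple (by 6) of its own reverse.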

\ansno 3. The procedure of Algorithm 4.5.2X is
valid, with polynomials over $S$ substituted for integers. When
the algorithm terminates, we have $U(x) = u↓2(x)$, $V(x) = u↓1(x)$.
Let $m =\\deg(u)$, $n =\\deg(v)$. It is easy to prove by induction
that $\\deg(u↓3) +\\deg(v↓1) = n$, $\\deg(u↓3) +\\deg(v↓2) = m$,
after step X3, throughout the execution of the algorithm, provided
that $m ≥ n$. Hence if $m$ and $n$ are greater than $d =\+deg\biglp
\gcd(u, v)\bigrp$ we have $\\deg(U) < m - d$, $\\deg(V) < n - d$;
the exact degrees are $m - d↓1$ and $n - d↓1$, where $d↓1$ is
the degree of the second-last nonzero remainder. If $d = \min(m,
n)$, say $d = n$, we have $U(x) = 0$ and $V(x) = 1$.

When $u(x) = x↑m - 1$ and $v(x) = x↑n - 1$, the
identity $(x↑m - 1)\mod(x↑n - 1) = x↑{m\mod n} - 1$ shows that
all polynomials occurring during the calculation are monic, with
integer coefficients. When $u(x) = x↑{21} - 1$ and $v(x) = x↑{13}
- 1$, we have $V(x) = x↑{11} + x↑8 + x↑6 + x↑3 + 1$ and $U(x) =
-(x↑{19} + x↑{16} + x↑{14} + x↑{11} + x↑8 + x↑6 + x↑3 + x)$.\xskip
$\biglp$See also Eq.\ 3.3.3--29, which gives an alternative formula
for $U(x)$ and $V(x)$.$\bigrp$
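Every divisor arising in this chain is monic, so the extended algorithm runs in exact integer arithmetic; a Python sketch along the lines of Algorithm 4.5.2X (all names ours) reproduces the stated $V(x)$.

```python
def poly_divmod(u, v):
    # long division by a monic polynomial; integer coefficient lists,
    # highest degree first; returns (quotient, remainder)
    u, q = u[:], []
    while len(u) >= len(v):
        c = u[0]
        q.append(c)
        for i in range(len(v)):
            u[i] -= c * v[i]
        del u[0]
    while u and u[0] == 0:
        del u[0]
    return q, u

def poly_mul(u, v):
    r = [0] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            r[i + j] += a * b
    return r

def poly_sub(u, v):
    u, v = u[::-1], v[::-1]
    r = [(u[i] if i < len(u) else 0) - (v[i] if i < len(v) else 0)
         for i in range(max(len(u), len(v)))]
    while len(r) > 1 and r[-1] == 0:
        r.pop()
    return r[::-1]

def ext_gcd(u, v):
    # returns (g, a, b) with a(x)u(x) + b(x)v(x) = g(x)
    u1, u2, u3 = [1], [0], u
    v1, v2, v3 = [0], [1], v
    while v3:
        q, r = poly_divmod(u3, v3)
        u1, v1 = v1, poly_sub(u1, poly_mul(q, v1))
        u2, v2 = v2, poly_sub(u2, poly_mul(q, v2))
        u3, v3 = v3, r
    return u3, u1, u2
```

For $u(x)=x↑{21}-1$ and $v(x)=x↑{13}-1$ the returned cofactor of $u$ is $x↑{11}+x↑8+x↑6+x↑3+1$, of degree $n-d↓1=11$, matching the text.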

\ansno 4. Since the quotient $q(x)$ depends only on $v(x)$
and the first $m - n$ coefficients of $u(x)$, the remainder
$r(x) = u(x) - q(x)v(x)$ is uniformly distributed and independent
of $v(x)$. Hence each step of the algorithm may be regarded
as independent of the others; this algorithm is much better behaved
than Euclid's algorithm over the integers.

The probability that $n↓1 = n - k$ is $p↑{1-k}(1
- 1/p)$, and $t = 0$ with probability $p↑{-n}$. Each succeeding
step has essentially the same behavior; hence we can see that
any given sequence of degrees $n$, $n↓1$, $\ldotss$, $n↓t$, $-∞$ occurs
with probability $(p - 1)↑t/p↑n$. To find the average value
of $f(n↓1, \ldotss , n↓t)$, let $S↓t$ be the sum of $f(n↓1, \ldotss
, n↓t)$ over all sequences $n > n↓1 >\cdots > n↓t
≥ 0$ having a given value of $t$; then the average is $\sum
↓t S↓t(p - 1)↑t/p↑n$.

Let $f(n↓1, \ldotss , n↓t) = t$; then
$S↓t = {n\choose t}t$, so the average is $n(1 - 1/p)$. Similarly,
if $f(n↓1, \ldotss , n↓t) = n↓1 +\cdots + n↓t$, then
$S↓t = {n\choose2}{n-1\choose t-1}$, and the average is ${n\choose2}(1
- 1/p)$. Finally, if $f(n↓1, \ldotss , n↓t) = (n - n↓1)n↓1 +\cdots
+ (n↓{t-1} - n↓t)n↓t$, then $S↓t = {{n+2\choose t+2}
- (n + 1){n+1\choose t+1}}+ {n+1\choose2}{n\choose t}$, and the
average is ${{n+1\choose2} - (n + 1)p/(p - 1)} + {\biglp p/(p -
1)\bigrp ↑2(1 - 1/p↑{n+1})}$.

As a consequence we can see that if $p$ is large
there is very high probability that $n↓{j+1} = n↓j - 1$ for
all $j$.\xskip (If this condition fails over the rational numbers,
it fails for all $p$, so we have further evidence for the text's
claim that Algorithm C almost always finds $\delta↓2 =\cdots
= 1$.)
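These exact averages can be confirmed by enumerating all pairs of monic degree-$n$ polynomials over GF(p) for small $p$ and $n$; in the sketch below (names ours) $t$ counts the nonzero remainders produced by Euclid's algorithm.

```python
from fractions import Fraction
from itertools import product

def euclid_steps(u, v, p):
    # number t of nonzero remainders when gcd(u, v) is computed over
    # GF(p); polynomials as coefficient tuples/lists, highest degree first
    def poly_mod(u, v):
        u = list(u)
        inv = pow(v[0], p - 2, p)   # p prime: inverse of leading coeff
        while len(u) >= len(v):
            c = u[0] * inv % p
            for i in range(len(v)):
                u[i] = (u[i] - c * v[i]) % p
            del u[0]
        while u and u[0] == 0:
            del u[0]
        return u
    t, r = 0, poly_mod(u, v)
    while r:
        t += 1
        u, v = v, r
        r = poly_mod(u, v)
    return t

def average_steps(p, n):
    # average of t over all pairs of monic degree-n polynomials;
    # the analysis above predicts exactly n(1 - 1/p)
    monic = [(1,) + rest for rest in product(range(p), repeat=n)]
    total = sum(euclid_steps(u, v, p) for u in monic for v in monic)
    return Fraction(total, len(monic) ** 2)
```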

\ansno 5. Using the formulas developed
in exercise 4, with $f(n↓1, \ldotss , n↓t) = \delta ↓{n↓t0}$,
we find that the probability is $1 - 1/p$ if $n > 0$, 1 if
$n = 0$.

\ansno 6. Assuming that the constant terms $u(0)$
and $v(0)$ are nonzero, imagine a ``right-to-left'' division
algorithm, $u(x) = v(x)q(x) + x↑{m-n}r(x)$, where $\\deg(r) <\\deg(v)$.
We obtain a gcd algorithm analogous to Algorithm 4.5.2B\null, which
is essentially Euclid's algorithm applied to the ``reverse''
of the original inputs (cf.\ exercise 2), afterwards reversing
the answer and multiplying by an appropriate power of $x$.

\ansno 7. The units of $S$ (as polynomials of degree
zero).
%folio 797 galley 12 (C) Addison-Wesley 1978	*
\def\\#1({\mathop{\hbox{#1}}(}\def\+#1\biglp{\mathop{\hbox{#1}}\biglp}
\ansno 8. If $u(x) = v(x)w(x)$, where $u(x)$ has
integer coefficients while $v(x)$ and $w(x)$ have rational coefficients,
there are integers $m$ and $n$ such that $m\cdot v(x)$ and $n \cdot
w(x)$ have integer coefficients. Now $u(x)$ is primitive, so
we have
$$u(x) = \pm\,\+pp\biglp m \cdot v(x)\bigrp\+pp\biglp n \cdot w(x)\bigrp.$$

\ansno 9. We can extend Algorithm E as follows:
Let $\biglp u↓1(x), u↓2(x), u↓3, u↓4(x)\bigrp$ and $\biglp v↓1(x),
v↓2(x)$, $v↓3, v↓4(x)\bigrp$ be quadruples that satisfy the relations
$u↓1(x)u(x) + u↓2(x)v(x) = u↓3u↓4(x)$, \ $v↓1(x)u(x) + v↓2(x)v(x)
= v↓3v↓4(x)$. The extended algorithm starts with the quadruples $\biglp 1, 0,
\\cont(u), \\pp(u(x))\bigrp$ and $\biglp 0, 1, \\cont(v), \\pp(v(x))\bigrp$
and manipulates them in such a way as to preserve
the above conditions, where $u↓4(x)$ and $v↓4(x)$ run through the
same sequence as $u(x)$ and $v(x)$ do in Algorithm E\null. If $au↓4(x)
= q(x)v↓4(x) + br(x)$, we have $av↓3\biglp u↓1(x), u↓2(x)\bigrp
- q(x)u↓3\biglp v↓1(x), v↓2(x)\bigrp = \biglp r↓1(x), r↓2(x)\bigrp
$, where $r↓1(x)u(x) + r↓2(x)v(x) = bu↓3v↓3r(x)$, so the extended
algorithm can preserve the desired relations. If $u(x)$ and
$v(x)$ are relatively prime, the extended algorithm eventually
finds $r(x)$ of degree zero, and we obtain $U(x) = r↓2(x)$, $V(x)
= r↓1(x)$ as desired.\xskip$\biglp$In practice we would divide $r↓1(x)$,
$r↓2(x)$, and $bu↓3v↓3$ by $\gcd\biglp\\cont(r↓1),\\cont(r↓2)\bigrp$.$\bigrp
$\xskip Conversely, if such $U(x)$ and $V(x)$ exist, then $u(x)$ and
$v(x)$ have no common prime divisors, since they are primitive
and have no common divisors of positive degree.

\ansno 10. By successively factoring polynomials
that are reducible into polynomials of smaller degree, we must
obtain a finite factorization of any polynomial into irreducibles.
The factorization of the {\sl content} is unique. To show that
there is at most one factorization of the primitive part, the
key result is to prove that if $u(x)$ is an irreducible factor
of $v(x)w(x)$, but not a unit multiple of the irreducible polynomial
$v(x)$, then $u(x)$ is a factor of $w(x)$. This can be proved
by observing that $u(x)$ is a factor of $v(x)w(x)U(x) = rw(x)
- w(x)u(x)V(x)$ by the result of exercise 9, where $r$ is a
nonzero constant.

\ansno 11. The only row names needed
would be $A↓1$, $A↓0$, $B↓4$,
$B↓3$, $B↓2$, $B↓1$, $B↓0$, $C↓1$, $C↓0$, $D↓0$. In general, let $u↓{j+2}(x)
= 0$; then the rows needed for the proof are $A↓{n↓2-n↓j}$
through $A↓0$, $B↓{n↓1-n↓j}$ through $B↓0$, $C↓{n↓2-n↓j}$ through $C↓0$, 
$D↓{n↓3-n↓j}$ through $D↓0$, etc.

\ansno 12. If $n↓k = 0$, the text's proof of (24) shows that the value of
the determinant is $\pm h↓k$, and this equals $\pm\lscr↓k↑{n↓{k-1}}/
\prod↓{1<j<k}\lscr↓{\!j}↑{\delta↓{j-1}(\delta↓j-1)}$.
If the polynomials have a factor
of positive degree, we can artificially assume that the polynomial
zero has degree zero and use the same formula with $\lscr↓k=
0$.

{\sl Notes:} The value $R(u, v)$ of Sylvester's determinant
is called the {\sl resultant} of $u$ and $v$, and the quantity
$(-1)↑{\hbox{\:e deg}(u)(\hbox{\:e deg}(u)-1)/2}\lscr(u)↑{-1}R(u, u↑\prime )$ is
called the {\sl discriminant} of $u$, where $u↑\prime$ is the derivative of $u$. 
If $u(x) = a(x - α↓1)
\ldotsm (x - α↓m)$ and $v(x) = b(x - β↓1) \ldotsm (x - β↓n)$,
we have $R(u, v) = a↑nv(α↓1) \ldotsm v(α↓m) = (-1)↑{mn}b↑mu(β↓1)
\ldotsm u(β↓n) = a↑nb↑m \prod↓{1≤i≤m,1≤j≤n}(α↓i - β↓j)$. It follows
that the polynomials of degree $mn$ in $y$ defined as the respective
resultants with $v(x)$ of $u(y - x)$, $u(y + x)$, $x↑mu(y/x)$, and $u(yx)$
have as respective roots the sums $α↓i + β↓j$, differences
$α↓i - β↓j$, products $α↓iβ↓j$, and quotients $α↓i/β↓j$ $\biglp$when
$v(0) ≠ 0\bigrp $. This idea has been used by R. G. K.
Loos to construct algorithms for arithmetic on algebraic numbers
[{\sl SIAM J. Computing} ({\sl c}. 1979), to appear].
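A small Python sketch (names ours) evaluates $R(u,v)$ as Sylvester's determinant over the rationals; tiny cases confirm the product formula $R(u,v)=a↑nv(α↓1)\ldotsm v(α↓m)$ and the familiar discriminant $b↑2-4c$ of a monic quadratic.

```python
from fractions import Fraction

def sylvester(u, v):
    # Sylvester matrix of u (degree m) and v (degree n); polynomials as
    # integer coefficient lists, highest degree first
    m, n = len(u) - 1, len(v) - 1
    rows = [[0] * i + u + [0] * (n - 1 - i) for i in range(n)]
    rows += [[0] * i + v + [0] * (m - 1 - i) for i in range(m)]
    return rows

def det(mat):
    # exact determinant by Gaussian elimination over the rationals
    a = [[Fraction(x) for x in row] for row in mat]
    n, d = len(a), Fraction(1)
    for j in range(n):
        pivot = next((i for i in range(j, n) if a[i][j]), None)
        if pivot is None:
            return Fraction(0)
        if pivot != j:
            a[j], a[pivot] = a[pivot], a[j]
            d = -d
        d *= a[j][j]
        for i in range(j + 1, n):
            c = a[i][j] / a[j][j]
            for k in range(j, n):
                a[i][k] -= c * a[j][k]
    return d

def resultant(u, v):
    return det(sylvester(u, v))
```

For example, $u=x↑2-1$ (roots $\pm1$) and $v=x-2$ give $R(u,v)=v(1)v(-1)=3$, and swapping the arguments multiplies the result by $(-1)↑{mn}$.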

If we replace each row $A↓i$ in Sylvester's matrix by
$$(b↓0A↓i + b↓1A↓{i+1} +\cdots + b↓{n↓2-1-i}A↓{n↓2-1})
- (a↓0B↓i + a↓1B↓{i+1} + \cdots + a↓{n↓2-1-i}B↓{n↓2-1}),$$
and then delete rows $B↓{n↓2-1}$
through $B↓0$ and the last $n↓2$ columns, we obtain an $n↓1
\times n↓1$ determinant for the resultant instead of the original
$(n↓1 + n↓2) \times (n↓1 + n↓2)$ determinant. In some cases
the resultant can be evaluated efficiently by means of this
determinant; see {\sl CACM \bf 12} (1969), 23--30, 302--303.

J. T. Schwartz has shown that resultants and Sturm sequences for
polynomials of degree $n$ can be evaluated with $O\biglp n(\log n)↑2\bigrp$
operations as $n→∞$.\xskip [``Probabilistic algorithms for verification of
polynomial identities,'' to appear.]

\ansno 13. One can show by induction on $j$ that the values of $\biglp
u↓{j+1}(x),g↓{j+1},h↓j\bigrp$ are replaced respectively by $\biglp\lscr↑{1+p↓{j\,}}
w(x)u↓j(x),\lscr↑{2+p↓{j\,}}g↓j,\lscr↑{p↓{j\,}}h↓j\bigrp$ for $j≥2$, 
where $p↓j=n↓1+n↓2-2n↓j$.\xskip
$\biglp$In spite of this growth, the bound (26) remains valid.$\bigrp$

\ansno 14. Let $p$ be a prime of the domain, and let $j,k$ be
maximum such that $p↑k\rslash v↓n = \lscr(v)$, $p↑j\rslash v↓{n-1}$.
Let $P = p↑k$. By Algorithm R we may write $q(x) = a↓0 + Pa↓1x
+\cdots + P↑sa↓sx↑s$, where $s = m - n ≥ 2$. Let
us look at the coefficients of $x↑{n+1}$, $x↑n$, and $x↑{n-1}$
in $v(x)q(x)$, namely $Pa↓1v↓n + P↑2a↓2v↓{n-1} +\cdotss$, $a↓0v↓n
+ Pa↓1v↓{n-1} +\cdotss$, and $a↓0v↓{n-1} + Pa↓1v↓{n-2} +\cdotss$,
each of which is a multiple of $P↑3$. We conclude from the first
that $p↑j\rslash a↓1$, from the second that $p↑{\min(k,2j)}\rslash
a↓0$, then from the third that $P\rslash a↓0$. Hence $P\rslash
r(x)$.\xskip $\biglp$If $m$ were only $n + 1$, the best we could prove would
be that $p↑{\lceil k/2\rceil }$ divides $r(x)$; e.g., consider
$u(x) = x↑3 + 1$, $v(x) = 4x↑2 + 2x + 1$, $r(x) = 18$. On the other hand, an
argument based on determinants of matrices like (21) and (22) can be used to
show that \def\\{\hbox{\:e deg}}$\lscr(r)↑{\\(v)-\\(r)-1}r(x)$ is always a
multiple of $\lscr(v)↑{(\\(u)-\\(v))(\\(v)-\\(r)-1)}$.$\bigrp$
%folio 800 galley 13 (C) Addison-Wesley 1978	*
\def\\#1({\mathop{\hbox{#1}}(}\def\+#1\biglp{\mathop{\hbox{#1}}\biglp}
\ansno 15. Let $c↓{ij} = a↓{i1}a↓{j1} +\cdots +
a↓{in}a↓{jn}$; we may assume that $c↓{ii} > 0$ for all $i$.
If $c↓{ij} ≠ 0$ for some $i ≠ j$, we can replace row $i$ by
$(a↓{i1} - ca↓{j1}, \ldotss , a↓{in} - ca↓{jn})$, where $c =
c↓{ij}/c↓{jj}$; this does not change the value of the determinant,
and it decreases the value of the upper bound we wish to prove,
since $c↓{ii}$ is replaced by $c↓{ii} - c↑{2}↓{ij}/c↓{jj}$.
These replacements can be done in a systematic way for increasing
$i$ and for $j < i$, until $c↓{ij} = 0$ for all $i ≠ j$.\xskip $\biglp$The
latter algorithm is called ``Schmidt's orthogonalization process'';
see {\sl Math.\ Annalen \bf 63} (1907), 442.$\bigrp$ Then $\det(A)↑2
= \det(AA↑T) = c↓{11} \ldotsm c↓{nn}$.

\ansno 16. Let $f(x↓1, \ldotss , x↓n) =
g↓m(x↓2, \ldotss , x↓n)x↑{m}↓{1} +\cdots + g↓0(x↓2,
\ldotss , x↓n)$, and let $g(x↓2, \ldotss , x↓n) = g↓m(x↓2, \ldotss
, x↓n)↑2 +\cdots + g↓0(x↓2, \ldotss , x↓n)↑2$; the
latter is not identically zero. We have $a↓N ≤ m(2N + 1)↑{n-1}
+ (2N + 1)b↓N$, where $b↓N$ counts the integer solutions
of $g(x↓2, \ldotss , x↓n) = 0$ with variables bounded by $N$.
Hence $\lim↓{N→∞} a↓N/(2N + 1)↑n = \lim↓{N→∞} b↓N/(2N + 1)↑{n-1}$,
and this is zero by induction.

\ansno 17. (a)\9 For convenience, let us describe the algorithm
only for $A = \{a, b\}$. The hypotheses imply that $\\deg(Q↓1U)
=\\deg(Q↓2V) ≥ 0$, and $\\deg(Q↓1) ≤\\deg(Q↓2)$. If $\\deg(Q↓1)
= 0$, then $Q↓1$ is just a nonzero rational number, so we set
$Q = Q↓2/Q↓1$. Otherwise we let $Q↓1 = aQ↓{11} + bQ↓{12} + r↓1$,
$Q↓2 = aQ↓{21} + bQ↓{22} + r↓2$, where $r↓1$ and $r↓2$ are rational
numbers; it follows that
$$Q↓1U - Q↓2V = a(Q↓{11}U - Q↓{21}V) + b(Q↓{12}U - Q↓{22}V)
+ r↓1U - r↓2V.$$
We must have either $\\deg(Q↓{11}) =\\deg(Q↓1) -
1$ or $\\deg(Q↓{12}) =\\deg(Q↓1) - 1$. In the former case, $\\deg(Q↓{11}U
- Q↓{21}V) <\\deg(Q↓{11}U)$, by considering the terms of highest
degree that start with $a$; so we may replace $(Q↓1, Q↓2)$
by $(Q↓{12}, Q↓{22})$ and repeat the process.

(b)\9 We may assume that $\\deg(U) ≥\\deg(V)$. If
$\\deg(R) ≥\\deg(V)$, note that $Q↓1U - Q↓2V = Q↓1R - {(Q↓2 -
Q↓1Q)}V$ has degree less than $\\deg(V) ≤\\deg(Q↓1R)$, so we can
repeat the process with $U$ replaced by $R$; we obtain $R =
Q↑\prime V + R↑\prime$, $U = (Q + Q↑\prime )V + R↑\prime $, where
$\\deg(R↑\prime ) <\\deg(R)$, so eventually a solution will be
obtained.

(c)\9 The algorithm of (b) gives $V↓1 = UV↓2 + R$,
$\\deg(R) <\\deg(V↓2)$; by homogeneity, $R = 0$ and $U$ is homogeneous.

(d)\9 We may assume that $\\deg(V) ≤\\deg(U)$. If
$\\deg(V) = 0$, set $W ← U$; otherwise use (c) to find $U = QV$,
so that $QVV = VQV$, $(QV - VQ)V = 0$. This implies that $QV =
VQ$, so we can set $U ← V$, $V ← Q$ and repeat the process.

For further details about the subject of this
exercise, see P. M. Cohn, {\sl Proc.\ Cambridge Phil.\ Soc.\ \bf 57}
(1961), 18--30. The considerably more difficult problem of characterizing
{\sl all} string polynomials such that $UV = VU$ has been solved
by G. M. Bergman [Ph.D. thesis, Harvard University, 1967].

\ansno 18. [P. M. Cohn, {\sl Transactions Amer.\ Math.\ Soc.\ \bf 109}
(1963), 332--356.]

\yskip\hang\textindent{\bf C1.} Set $u↓1 ← U↓1$, $u↓2
← U↓2$, $v↓1 ← V↓1$, $v↓2 ← V↓2$, $z↓1 ← z↑\prime↓{2} ← w↓1 ← w↑\prime↓{2}
← 1$, $z↑\prime↓{1} ← z↓2 ← w↑\prime↓{1} ← w↓2 ← 0$, $n
← 0$.

\yskip\hang\textindent{\bf C2.} (At this point the identities
given in the exercise hold, and also $u↓1v↓1 = u↓2v↓2$; $v↓2 =
0$ if and only if $u↓1 = 0$.)\xskip If $v↓2 = 0$, the algorithm terminates
with gcrd$(V↓1, V↓2) = v↓1$, lclm$(V↓1, V↓2) = z↑\prime↓{1}V↓1
= -z↑\prime↓{2}V↓2$.\xskip (Also, by symmetry, gcld$(U↓1, U↓2) = u↓2$,
lcrm$(U↓1, U↓2) = U↓1w↓1 = -U↓2w↓2$.)

\yskip\hang\textindent{\bf C3.} Find $Q$ and $R$ such that $v↓1
= Qv↓2 + R$, where $\\deg(R) <\\deg(v↓2)$.\xskip$\biglp$We have $u↓1(Qv↓2
+ R) = u↓2v↓2$, so $u↓1R = (u↓2 - u↓1Q)v↓2 = R↑\prime v↓2.\bigrp$

\yskip\hang\textindent{\bf C4.} Set $(w↓1$, $w↓2$, $w↑\prime↓{1}$,
$w↑\prime↓{2}$, $z↓1$, $z↓2$, $z↑\prime↓{1}$, $z↑\prime↓{2}$,
$u↓1$, $u↓2$, $v↓1$, $v↓2) ← (w↑\prime↓{1} - w↓1Q$, $w↑\prime↓{2}
- w↓2Q$, $w↓1$, $w↓2$, $z↑\prime↓{1}$, $z↑\prime↓{2}$, $z↓1 - Qz↑\prime↓{1}$,
$z↓2 - Qz↑\prime↓{2}$, $u↓2 - u↓1Q$, $u↓1$, $v↓2$, $v↓1 - Qv↓2)$ and
$n ← n + 1$. Go back to C2.\quad\blackslug
%folio 802 galley 1a (C) Addison-Wesley 1978	*
\def\\#1({\mathop{\hbox{#1}}(}\def\+#1\biglp{\mathop{\hbox{#1}}\biglp}
\yyskip This extension of Euclid's algorithm includes most
of the features we have seen in previous extensions, all at
the same time, so it provides new insight into the special cases
already considered. To prove that it is valid, note first that
deg$(v↓2)$ decreases in step C4, so the algorithm certainly
terminates. At the conclusion of the algorithm, $v↓1$ is a common
right divisor of $V↓1$ and $V↓2$, since $w↓1v↓1 = (-1)↑nV↓1$
and $-w↓2v↓1 = (-1)↑nV↓2$; also if $d$ is any common right divisor
of $V↓1$ and $V↓2$, it is a right divisor of $z↓1V↓1 + z↓2V↓2
= v↓1$. Hence $v↓1 =\\gcrd(V↓1, V↓2)$. Also if $m$ is any common
left multiple of $V↓1$ and $V↓2$, we may assume without loss
of generality that $m = U↓1V↓1 = U↓2V↓2$, since the sequence
of values of $Q$ does not depend on $U↓1$ and $U↓2$. Hence $m
= (-1)↑n(-u↓2z↑\prime↓{1})V↓1 = (-1)↑n(u↓2z↑\prime↓{2})V↓2$
is a multiple of $z↑\prime↓{1}V↓1$.

In practice, if we just want to calculate gcrd$(V↓1,
V↓2)$, we may suppress the computation of $n$, $w↓1$, $w↓2$, $w↑\prime↓{1}$,
$w↑\prime↓{2}$, $z↓1$, $z↓2$, $z↑\prime↓{1}$, $z↑\prime↓{2}$.
These additional quantities were added to the algorithm primarily
to make its validity more readily established.

{\sl Note:} Nontrivial factorizations of string
polynomials, such as the example given with this exercise, can
be found from matrix identities such as
$$\left({a \atop 1}\qquad{1\atop 0}\right)\left({b\atop 1}\qquad{1\atop
0}\right)\left({c\atop 1}\qquad{1\atop0}\right)\left({0\atop 1}\quad
{\quad1\atop -c}\right)\left({0\atop 1}\quad{\quad1\atop -b}\right)\left({0\atop
1}\quad{\quad1\atop -a}\right) = \left({1\qquad 0\atop 0\qquad 1}\right),$$
since such identities hold even when multiplication is not commutative.
For example,
$$(abc + a + c)(1 + ba) = (ab + 1)(cba + a + c).$$
(Compare this with the ``continuant polynomials'' of Section 4.5.3.)
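Identities like this are quickly checked by multiplying in the free (noncommutative) ring $Z\langle a,b,c\rangle$, representing a polynomial as a dictionary from words to coefficients; a Python sketch (names ours):

```python
from collections import defaultdict

def nc_mul(f, g):
    # product in Z<a,b,c>: concatenate words, multiply coefficients
    h = defaultdict(int)
    for w1, c1 in f.items():
        for w2, c2 in g.items():
            h[w1 + w2] += c1 * c2
    return {w: c for w, c in h.items() if c}

# (abc + a + c)(1 + ba) versus (ab + 1)(cba + a + c)
lhs = nc_mul({'abc': 1, 'a': 1, 'c': 1}, {'': 1, 'ba': 1})
rhs = nc_mul({'ab': 1, '': 1}, {'cba': 1, 'a': 1, 'c': 1})
```

Both sides expand to $abc+abcba+a+aba+c+cba$, with no commutations used.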

\ansno 19. [Cf.\ Eug\`ene Cahen, {\sl Th\'eorie des Nombres \bf1} (Paris: A.
Hermann & fils, 1914), \hbox{336--338}.]\xskip
If such an algorithm
exists, $D$ is a gcrd by the argument in exercise 18. Let us
regard $A$ and $B$ as a single $2n \times n$ matrix $C$ whose
first $n$ rows are those of $A$, and whose second $n$ rows are
those of $B$. Similarly, $P$ and $Q$ can be combined into a
$2n \times n$ matrix $R$; $X$ and $Y$ can be combined into an
$n \times 2n$ matrix $Z$. The desired conditions now reduce
to two equations $C = RD$, $D = ZC$. If we can find a $2n \times
2n$ integer matrix $U$ of determinant $\pm 1$ such that the last
$n$ rows of $U↑{-1}C$ are all zero, then $R = ($first $n$ columns
of $U)$, $D = ($first $n$ rows of $U↑{-1}C)$, $Z = ($first $n$ rows
of $U↑{-1})$ solves the desired conditions. Hence, for example,
the following algorithm may be used (with $m=2n$):

\algbegin Algorithm T (Triangulation). Let $C$ be an $m \times n$ matrix
of integers. This algorithm finds $m \times m$ integer matrices
$U$ and $V$ such that $UV = I$ and $VC$ is ``upper triangular.''\xskip
(The entry in row $i$ and column $j$ of $VC$ is zero if $i >
j$.)

\algstep T1. [Initialize.] Set $U ← V ← I$, the
$m \times m$ identity matrix; and set $T ← C$.\xskip (Throughout the
algorithm we will have $T = VC$ and $UV = I$.)

\algstep T2. [Iterate on $j$.] Do step T3 for $j = 1$, 2, $\ldotss
$, $\min(m, n)$, and terminate the algorithm.

\algstep T3. [Zero out column $j$.] Perform the following transformation
zero or more times until $T↓{ij}$ is zero for all $i > j$: Let
$T↓{kj}$ be a nonzero element of $\{T↓{ij}, T↓{(j+1)j}, \ldotss,
T↓{mj}\}$ having the smallest absolute value. Interchange
rows $k$ and $j$ of $T$ and of $V$; interchange columns $k$
and $j$ of $U$. Then subtract $\lfloor T↓{ij}/T↓{jj}\rfloor$
times row $j$ from row $i$, in matrices $T$ and $V$, and add
the same multiple of column $i$ to column $j$ in matrix $U$,
for $j < i ≤ m$.\quad\blackslug

\yyskip\noindent For the stated example, the
algorithm yields \def\\#1#2#3#4{({#1\atop#3}\;{#2\atop#4})}
$\\1234=\\1032\\1{\quad2}0{-1}$, $\\4321=\\4523\\1{\quad2}0{-1}$,
$\\1{\quad2}0{-1}=\\1{\quad0}2{-2}\\1234
+\\0010\\4321$.\xskip (Actually
{\sl any} matrix with determinant $\pm 1$ would be a gcrd in this particular case.)

\ansno 20. It may be helpful to consider exercise 4.6.2--22
with $p↑m$ replaced by a small number $ε$.

\ansno 21. Note that Algorithm R is used only when $m - n ≤
1$; furthermore, the coefficients are bounded by (25) with $m =
n$.\xskip [The stated formula is, in fact, the execution time observed
in practice, not merely an upper bound. For more detailed information
see G. E. Collins, {\sl Proc.\ 1968 Summer Inst.\ on Symbolic
Math.\ Comp.}, Robert G. Tobey, ed.\ (IBM Federal Systems Center,
June 1969), 195--231.]

\ansno 22. A sequence of signs cannot contain two consecutive
zeros, since $u↓{k+1}(x)$ is a nonzero constant in (29). Moreover
we cannot have ``+, 0, +'' or ``$-$, 0, $-$'' as subsequences. The formula
$V(u, a) - V(u, b)$ is clearly valid when $b = a$, so we must
only verify it as $b$ increases. The polynomials $u↓j(x)$ have
finitely many roots, and $V(u, b)$ changes only when $b$ encounters
or passes such roots. Let $x$ be a root of some (possibly several)
$u↓j$. When $b$ increases from $x - ε$ to $x$, the sign sequence
near $j$ goes from ``+, $\pm$, $-$'' to ``+, 0, $-$'' or from ``$-$, $\pm$, +''
to ``$-$, 0, +'' if $j > 0$; and from ``+, $-$'' to ``0, $-$'' or from ``$-$, +'' to
``0, +'' if $j = 0$.\xskip (Since $u↑\prime (x)$ is the derivative, $u↑\prime
(x)$ is negative when $u(x)$ is decreasing.)\xskip Thus the net change
in $V$ is $-\delta ↓{j0}$. When $b$ increases from $x$ to $x
+ ε$, a similar argument shows that $V$ remains unchanged.

[L. E. Heindel, {\sl JACM} {\bf 18} (1971), 533--548,
has applied these ideas to construct algorithms for isolating
the real zeroes of a given polynomial $u(x)$, in time bounded
by a polynomial in deg$(u)$ and log $N$, where all coefficients
$u↓j$ are integers with $|u↓j| ≤ N$, and all operations are
guaranteed to be exact.]

\ansno 23. If $v$ has $n - 1$ real roots occurring between the
$n$ real roots of $u$, then (by considering sign changes) $u(x)\mod
v(x)$ has $n - 2$ real roots lying between the $n - 1$ of $v$.

\ansno 24. First show that $h↓j=g↓{\!j}↑{\delta↓{j-1}}g↓{\!j-1}↑{\delta↓{j-2}
(1-\delta↓{j-1})}\ldotss g↓2↑{\delta↓1(1-\delta↓2)\ldotsm(1-\delta↓{j-1})}$.
Then show that the exponent of $g↓2$ on the left-hand side of (18) has the form
$\delta↓2+\delta↓1x$, where $x=\delta↓2+\cdots+\delta↓{j-1}+1-\delta↓2(\delta↓3+
\cdots+\delta↓{j-1}+1)-\delta↓3(1-\delta↓2)(\delta↓4+\cdots+\delta↓{j-1}+1)-\cdots
-{\delta↓{j-1}(1-\delta↓2)\ldotsm(1-\delta↓{j-2})(1)}$. But $x=1$, since
it is seen to be independent of $\delta↓{j-1}$ and we can set $\delta↓{j-1}=0$,
etc. A similar derivation works for $g↓3$, $g↓4$, $\ldotss$, and a simpler
derivation works for (23).

\ansno 25. Each coefficient of $u↓j(x)$ can be expressed as a determinant in
which one column contains only $\lscr(u)$, $\lscr(v)$, and zeros. To use
this fact, modify Algorithm C as follows: In step C1, set $g←\gcd\biglp\lscr(u),
\lscr(v)\bigrp$ and $h←0$.
In step C3, if $h=0$, set $u(x)←v(x)$, $v(x)←r(x)/g$, $h←\lscr(u)↑\delta/g$,
$g←\lscr(u)$, and return to C2; otherwise proceed as in the unmodified
algorithm. The effect of this new initialization is simply to replace
$u↓j(x)$ by $u↓j(x)/\!\gcd\biglp\lscr(u),\lscr(v)\bigrp$ for all $j≥3$;
thus, $\lscr↑{2j-4}$ will become $\lscr↑{2j-5}$ in (28).
%folio 804 galley 1b (C) Addison-Wesley 1978	*
\ansbegin{4.6.2}

\ansno 1. By the principle of inclusion
and exclusion (Section 1.3.3), the number of polynomials without
linear factors is $\sum ↓{0≤k≤p} {p\choose k}p↑{n-k}(-1)↑k = p↑{n-p}(p
- 1)↑p$, when $n ≥ p$. The stated probability is therefore $1 - (1 - 1/p)↑p$,
which is greater than ${1\over 2}$.\xskip [In fact, the stated probability
is greater than ${1\over 2}$ for all $n ≥ 1$.]

\ansno 2. (a)\9 We know that $u(x)$ has a representation
as a product of irreducible polynomials; and the leading coefficients
of these polynomials must be units, since they divide the leading
coefficient of $u(x)$. Therefore we may assume that $u(x)$ has
a representation as a product of monic irreducible polynomials
$p↓1(x)↑{e↓1}\ldotsm p↓r(x)↑{e↓r}$, where $p↓1(x)$,
$\ldotss$, $p↓r(x)$ are distinct. This representation is unique,
except for the order of the factors, so the conditions on $u(x)$,
$v(x)$, $w(x)$ are satisfied if and only if
$$v(x) = p↓1(x)↑{\lfloor e↓1/2\rfloor} \ldotss p↓r(x)↑{\lfloor e↓r/2\rfloor},
\qquad w(x) = p↓1(x)↑{e↓1\mod2} \ldotss p↓r(x)↑{e↓r\mod2}.$$

(b)\9 The generating function for the number
of monic polynomials of degree $n$ is $1 + pz + p↑2z↑2 + \cdots
= 1/(1 - pz)$. The generating function for the number of polynomials
of degree $n$ having the form $v(x)↑2$, where $v(x)$ is monic,
is $1 + pz↑2 + p↑2z↑4 + \cdots = 1/(1 - pz↑2)$. If the generating
function for the number of monic squarefree polynomials of degree
$n$ is $g(z)$, then by part (a)  we must have $1/(1 - pz) = g(z)/(1 - pz↑2)$.
Hence $g(z) = (1 - pz↑2)/(1 - pz) = 1 + pz + (p↑2 - p)z↑2 +
(p↑3 - p↑2)z↑3 + \cdotss$. The answer is $p↑n - p↑{n-1}$ for
$n ≥ 2$.\xskip $\biglp$Curiously, this proves that $\gcd\biglp u(x), u↑\prime
(x)\bigrp = 1$ with probability $1 - 1/p$; it is the same as
the probability that $\gcd\biglp u(x), v(x)\bigrp = 1$ when $u(x)$
and $v(x)$ are {\sl independent}, by exercise 4.6.1--5.$\bigrp$
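The count $p↑n-p↑{n-1}$ is easy to check by brute force in the quadratic case, where a monic polynomial fails to be squarefree exactly when it equals $(x+a)↑2$ for some $a$; the following Python sketch (function name ours) counts the squarefree monic quadratics mod $p$:

```python
def squarefree_monic_quadratics(p):
    # A monic quadratic x^2 + b*x + c is non-squarefree exactly when
    # it is (x+a)^2, i.e., (b, c) = (2a mod p, a^2 mod p) for some a.
    squares = {((2 * a) % p, (a * a) % p) for a in range(p)}
    return sum((b, c) not in squares
               for b in range(p) for c in range(p))
```

For every prime $p$ this agrees with $p↑2-p$.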

\ansno 3. Let $u(x) = u↓1(x) \ldotsm u↓r(x)$. There is {\sl at
most} one such $v(x)$, by the argument of Theorem 4.3.2C\null. There
is {\sl at least} one if, for each $j$, we can solve the system
with $w↓j(x) = 1$ and $w↓k(x) = 0$ for $k ≠ j$. A solution to
the latter is $v↓1(x)\prod↓{k≠j}u↓k(x)$, where $v↓1(x)$
and $v↓2(x)$ can be found satisfying
$$\textstyle v↓1(x)\prod↓{k≠j}u↓k(x)\,+\,v↓2(x)u↓j(x) = 1,\qquad
\hbox{deg}(v↓1) <\hbox{deg}(u↓j),$$
by the extension of Euclid's algorithm (exercise 4.6.1--3).
%folio 805 galley 2 (C) Addison-Wesley 1978	*
\ansno 4. The unique factorization theorem gives the identity
$(1 - pz)↑{-1} =\prod↓{n≥1} (1 - z↑n)↑{-a↓{np}}$;
after taking logarithms, this can be rewritten$$\textstyle\sum ↓{j≥1} G↓p(z↑j)/j
= \sum ↓{k,j≥1} a↓{kp}z↑{kj}/j = \ln\biglp 1/(1 - pz)\bigrp.$$
The stated identity now yields the answer $G↓p(z) = \sum ↓{m≥1}
\mu (m)m↑{-1}\ln\biglp 1/(1 - pz↑m)\bigrp $, from which we
obtain $a↓{np} = \sum ↓{d\rslash n} \mu (n/d)n↑{-1}p↑d$; thus $\lim↓{p→∞}
a↓{np}/p↑n = 1/n$. To prove the stated identity, note that $\sum
↓{n,j≥1}\mu (n)g(z↑{nj})n↑{-t}j↑{-t} = \sum ↓{m≥1} g(z↑m)m↑{-t}
\sum ↓{n\rslash m} \mu (n) = g(z)$.

\ansno 5. Let $a↓{npr}$ be the number of monic
polynomials of degree $n$ modulo $p$ having exactly $r$ irreducible
factors. Then $\Gscr↓p(z, w) = \sum ↓{n,r≥0} a↓{npr}z↑nw↑r =
\exp\biglp \sum ↓{k≥1} G↓p(z↑k)w↑k/k\bigrp$, where $G↓p$ is the generating
function in exercise 4; cf.\ Eq.\ 1.2.9--34.
We have $$\baselineskip15pt\eqalign{\textstyle\sum ↓{n≥0} A↓{np}z↑n
= d\Gscr↓p(z/p, w)/d↓{\null}w\,|↓{w=1}⊗=\textstyle
\biglp \sum ↓{k≥1}G↓p(z↑k/p↑k)\bigrp \exp\biglp\ln(1/(1
- z))\bigrp\cr⊗\textstyle=\biglp\sum↓{n≥1}\ln\biglp 1/(1-p↑{1-n}z↑n)\bigrp\varphi
(n)/n\bigrp/(1-z),\cr}$$ hence $A↓{np} = H↓n + 1/2p + O(p↑{-2})$ for
$n ≥ 2$. The average value of $2↑r$ is the coefficient of $z↑n$
in $\Gscr↓p(z/p, 2)$, namely $n + 1 + (n - 1)/p + O(p↑{-2})$.\xskip (The
variance is of order $n↑3$, however: set $w = 4$.)

\ansno 6. For $0 ≤ s < p$, $x - s$ is a factor of
$x↑p - x$ (modulo $p$) by Fermat's theorem. So $x↑p - x$ is
a multiple of $\lcm\biglp x - 0, x - 1, \ldotss , x - (p - 1)\bigrp=x↑{\underline p}
$.\xskip $\biglp${\sl Note:} Therefore the Stirling numbers
$p\comb[]k$ are multiples of $p$ except when $k = 1$, $k =
p$. Equation 1.2.6--41 shows that the same statement is valid
for Stirling numbers $p\comb\{\} k$ of the other kind.$\bigrp$
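The congruence $x↑p - x ≡ x↑{\underline p}\modulo p$ is easy to check numerically; the following Python sketch (name ours) multiplies out $x(x-1)\ldotsm \biglp x-(p-1)\bigrp$ with coefficients reduced mod $p$:

```python
def falling_factorial_poly(p):
    """Coefficients (lowest degree first) of x(x-1)...(x-(p-1)) mod p."""
    poly = [1]
    for s in range(p):
        # multiply poly by (x - s)
        nxt = [0] * (len(poly) + 1)
        for i, c in enumerate(poly):
            nxt[i] = (nxt[i] - s * c) % p
            nxt[i + 1] = (nxt[i + 1] + c) % p
        poly = nxt
    return poly
```

All middle coefficients vanish mod $p$, leaving $x↑p-x$, which illustrates the remark about the Stirling numbers.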

\ansno 7. The factors on the right are relatively
prime, and each is a divisor of $u(x)$, so their product divides
$u(x)$. On the other hand,
$$\textstyle u(x)\qquad\hbox{divides}\qquad v(x)↑p - v(x) = \prod ↓{0≤s<p}
\biglp v(x) - s\bigrp ,$$
so it divides the right-hand side by exercise 4.5.2--2.

\ansno 8. The vector (18) is the only output whose $k$th component is nonzero.

\ansno 9. For example, start with $x ← 1$, $y ← 1$;
then repeatedly set $R[x] ← y$, $x ←2x \mod 101$, $y ← 51y
\mod 101$, one hundred times.
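The effect of this loop is to tabulate reciprocals: since 2 is a primitive root modulo 101 and $51≡2↑{-1}$, after $k$ steps we have $x=2↑k$ and $y=2↑{-k}$, so $R[x]=x↑{-1}$ for every nonzero $x$. A Python sketch (parameter names ours):

```python
def reciprocal_table(p=101, g=2, ginv=51):
    """Build R with R[x] = x^(-1) mod p without any division:
    g must be a primitive root mod p and ginv its inverse
    (2 * 51 = 102 is congruent to 1 mod 101)."""
    R = [0] * p
    x, y = 1, 1
    for _ in range(p - 1):
        R[x] = y
        x = x * g % p
        y = y * ginv % p
    return R
```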

\ansno 10. The matrix $Q - I$ below has a null space generated
by the two vectors $v↑{[1]} =(1, 0, 0, 0, 0, 0, 0, 0)$, $v↑{[2]}
= (0, 1, 1, 0, 0, 1, 1, 1)$. The factorization is
$$(x↑6 + x↑5 + x↑4 + x + 1)(x↑2 + x + 1).$$

\hbox to size{\qquad$\dispstyle{p=2\lower5pt\null\atop
\left(\,\vcenter{\halign{\hfill#⊗\hbox to 15pt{\hfill#}⊗\!
\hbox to 15pt{\hfill#}⊗\hbox to 15pt{\hfill#}⊗\hbox to 15pt{\hfill#}⊗\!
\hbox to 15pt{\hfill#}⊗\hbox to 15pt{\hfill#}⊗\hbox to 15pt{\hfill#}\cr
0⊗0⊗0⊗0⊗0⊗0⊗0⊗0\cr
0⊗1⊗1⊗0⊗0⊗0⊗0⊗0\cr
0⊗0⊗1⊗0⊗1⊗0⊗0⊗0\cr
0⊗0⊗0⊗1⊗0⊗0⊗1⊗0\cr
1⊗0⊗0⊗1⊗0⊗0⊗1⊗0\cr
1⊗0⊗1⊗1⊗1⊗0⊗0⊗0\cr
0⊗0⊗1⊗0⊗1⊗1⊗0⊗1\cr
1⊗1⊗0⊗1⊗1⊗1⊗0⊗1\cr}}\,\right)}\hfill{p=5\lower5pt\null\atop
\left(\,\vcenter{\halign{\hfill#⊗\hbox to 15pt{\hfill#}⊗\!
\hbox to 15pt{\hfill#}⊗\hbox to 15pt{\hfill#}⊗\!
\hbox to 15pt{\hfill#}⊗\hbox to 15pt{\hfill#}⊗\hbox to 15pt{\hfill#}\cr
0⊗0⊗0⊗0⊗0⊗0⊗0\cr
0⊗4⊗0⊗0⊗0⊗1⊗0\cr
0⊗2⊗2⊗0⊗4⊗3⊗4\cr
0⊗1⊗4⊗4⊗4⊗2⊗1\cr
2⊗2⊗2⊗3⊗4⊗3⊗2\cr
0⊗0⊗4⊗0⊗1⊗3⊗2\cr
3⊗0⊗2⊗1⊗4⊗2⊗1\cr}}\,\right)}$\qquad}

\ansno 11. Removing the trivial factor
$x$, the matrix $Q - I$ above has a null space generated by
$(1, 0, 0, 0, 0, 0, 0)$ and $(0, 3, 1, 4, 1, 2, 1)$. The factorization
is
$$x(x↑2 + 3x + 4)(x↑5 + 2x↑4 + x↑3 + 4x↑2 + x + 3).$$

\ansno 12. If $p = 2$, $(x + 1)↑4 = x↑4 + 1$. If $p
= 8k + 1$, $Q - I$ is the zero matrix, so there are four factors.
For other values of $p$ we have
$$\lower18pt\hbox{$Q-I=\null$}
{p=8k+3\lower5pt\null\atop
\left(\,\vcenter{\halign{\hfill#⊗\hbox to 20pt{\hfill$#$}⊗\!
\hbox to 20pt{\hfill$#$}⊗\hbox to 20pt{\hfill$#$}⊗\hbox to 20pt{\hfill$#$}\cr
0⊗0⊗0⊗0\cr0⊗-1⊗0⊗1\cr0⊗0⊗-2⊗0\cr0⊗1⊗0⊗-1\cr}}\,\right)}
{p=8k+5\lower5pt\null\atop
\left(\,\vcenter{\halign{\hfill#⊗\hbox to 20pt{\hfill$#$}⊗\!
\hbox to 20pt{\hfill$#$}⊗\hbox to 20pt{\hfill$#$}⊗\hbox to 20pt{\hfill$#$}\cr
0⊗0⊗0⊗0\cr0⊗-2⊗0⊗0\cr0⊗0⊗0⊗0\cr0⊗0⊗0⊗-2\cr}}\,\right)}
{p=8k+7\lower5pt\null\atop
\left(\,\vcenter{\halign{\hfill#⊗\hbox to 20pt{\hfill$#$}⊗\!
\hbox to 20pt{\hfill$#$}⊗\hbox to 20pt{\hfill$#$}⊗\hbox to 20pt{\hfill$#$}\cr
0⊗0⊗0⊗0\cr0⊗-1⊗0⊗-1\cr0⊗0⊗-2⊗0\cr0⊗-1⊗0⊗-1\cr}}\,\right)}$$
Here $Q - I$ has rank 2, so there are $4
- 2 = 2$ factors.\xskip $\biglp$But it is easy to prove that $x↑4 + 1$ is
irreducible over the integers, since it has no linear factors
and the coefficient of $x$ in any factor of degree two must
be less than or equal to 2 in absolute value by exercise 20.
For all $k ≥ 2$, H. P. F. Swinnerton-Dyer has exhibited polynomials
of degree $2↑k$ that are irreducible over the integers, but
they split completely into linear and quadratic factors modulo
every prime. For degree 8, his example is $x↑8 - 16x↑6 + 88x↑4
+192x↑2 + 144$, having roots $\pm\sqrt 2\pm\sqrt3\pm i$ [see {\sl Math.\
Comp.\ \bf24} (1970), 733--734]. According to the theorem of Frobenius cited
in exercise 33, any irreducible polynomial of degree $n$
whose Galois group contains no $n$-cycles will have factors modulo almost
all primes.$\bigrp$

\ansno 13. $p=8k+1$:
${\biglp x+(1+\sqrt{-1})/\sqrt2\bigrp}\*
{\biglp x+(1-\sqrt{-1})/\sqrt2\bigrp}\*
{\biglp x-(1+\sqrt{-1})/\sqrt2\bigrp}\*
{\biglp x-(1-\sqrt{-1})/\sqrt2\bigrp}$.\xskip
$p=8k+3$: ${\biglp x↑2-\sqrt{-2}x-1\bigrp}\*
{\biglp x↑2-\sqrt{-2}x-1\bigrp}$.\xskip
$p=8k+5$: ${\biglp x↑2-\sqrt{-1}\bigrp}\*
{\biglp x↑2-\sqrt{-1}\bigrp}$.\xskip
$p=8k+7$: ${\biglp x↑2-\sqrt2x+1\bigrp}\*
{\biglp x↑2+\sqrt2x+1\bigrp}$. The latter factorization also holds over the
field of real numbers.

\ansno 14. The gcd is 1 when $(s↓i+t)↑{(p-1)/2}≡0$ or $-1$ for all $i$;
it is $w(x)$ when $(s↓i+t)↑{(p-1)/2}≡1$ for all $i$. To get a lower bound
on $P$, let $k=2$. The ``bad'' values of $t$ satisfy $t≡-s↓1$ or $t≡-s↓2$
or $\biglp(s↓1+t)/(s↓2+t)\bigrp↑{(p-1)/2}≡1$. Furthermore $(s↓1+t)/(s↓2+t)$
runs through all values except 0 and 1 as $t$ runs through all values other
than $-s↓1$ and $-s↓2$, modulo\penalty999\
$p$. Hence there are at most $2+(p-1)/2-1$
bad values.  $\biglp$If $t≡-s↓1$ and $t≡-s↓2$ are both bad, then $(-1)↑{(p-1)/2}≡1$;
thus $P≥1/2+1/(2p)$ when $p≡3\modulo4$.$\bigrp$

{\sl Notes:} It follows that with probability $>1-ε$ we will find a root of
$w(x)$ modulo $p$ in $O\biglp(\log(1/ε))(\log k)(k↑2(\log p)↑3+k↑3(\log p)↑2)\bigrp$
units of time.\xskip
$\biglp$The factor $\log(1/ε)$ is the number of independent trials
needed per reduction, while $\log k$ is the maximum number of reductions, since the
degree is at worst halved when a nontrivial factorization is found;
$k↑2(\log p)↑3$ units of time will evaluate $(x+t)↑{(p-1)/2}\mod w(x)$, while
$k↑3(\log p)↑2$ suffice to compute the gcd.$\bigrp$\xskip
On the other hand, the true behavior is probably better than these ``worst-case''
estimates. Suppose that each linear factor $x-s↓i$ has probability $1\over2$ of
dividing $(x+t)↑{(p-1)/2}-1$ for each $t$, independent of the behavior for
other $s↓j$ and $t$. Then if we encode each $s↓i$ by a sequence of 0's and 1's
according as $x-s↓i$ does or doesn't divide $(x+t)↑{(p-1)/2}-1$ for the successive
$t$'s tried, we obtain a random binary trie with $k$ leaves (cf.\ Section 6.3).
The cost associated with an internal node of this trie, having $d$ leaves as
descendants, is $O\biglp d↑2(\log p)↑3+d↑3(\log p)↑2\bigrp$. The solution to
the recurrences $A↓n={n\choose2}+2↑{1-n}\sum↓k{n\choose k}A↓k$,
$B↓n={n\choose3}+2↑{1-n}\sum↓k{n\choose k}B↓k$, is $A↓n=2{n\choose2}$,
$B↓n={4\over3}{n\choose3}$, by exercise 5.2.2--36. Hence the sum of costs
in the given random trie---representing the time to factor $w(x)$ {\sl
completely}---is $O\biglp k↑2(\log p)↑3+k↑3(\log p)↑2\bigrp$ under this
plausible assumption.

\ansno 15. We may assume that $u≠0$ and that $p$ is odd. Berlekamp's method applied
to the polynomial $x↑2-u$ tells us that a square root exists if and only if
$u↑{(p-1)/2}\mod p=1$; let us assume that this condition holds.

Let $p-1=2↑e\cdot q$, where $q$ is odd. Zassenhaus's factoring procedure suggests
the following square-root extraction algorithm: Set $t←0$. Evaluate
$$\twoline{\gcd\biglp(x+t)↑q-1,x↑2-u\bigrp,\;\gcd\biglp(x+t)↑{2q}-1,x↑2
-u\bigrp,}{2pt}{\gcd\biglp(x+t)↑{4q}-1,x↑2-u\bigrp,\;\gcd\biglp(x+t)↑{8q}-1,
x↑2-u\bigrp,\;\ldotss,}$$
until finding the first case where the gcd is not 1 (modulo $p$). If the
gcd is $x-v$, then $\sqrt u = \pm v$. If the gcd is $x↑2-u$, set $t←t+1$ and
repeat the calculation.

{\sl Notes:} If $(x+t)↑k\mod(x↑2-u)=ax+b$, then we have $(x+t)↑{k+1}\mod(x↑2-u)=
(b+at)x+(bt+au)$, and $(x+t)↑{2k}\mod(x↑2-u)=2abx+(b↑2+a↑2u)$; hence
$(x+t)↑q$, $(x+t)↑{2q}$, $\ldots$ are easy to evaluate efficiently, and the
calculation for fixed $t$ takes $O\biglp(\log p)↑3\bigrp$ units of time.
The square root will be found when $t=0$ with probability $1/2↑{e-1}$;
thus it will always be found immediately when $p\mod4=3$. If we choose
$t$ at random instead of increasing it sequentially, exercise 14 gives a
rigorous proof that each $t$ gives success at least about half of the time;
but for practical purposes this random choice isn't needed.

Another square-root method has been suggested by D. Shanks. When $e>1$ it
requires an auxiliary constant $z$ (depending only on $p$) such that
$z↑{2↑{e-1}}≡-1\modulo p$. The value $z=n↑q\mod p$ will work for almost
one half of all integers $n$; once $z$ is known, the following algorithm
requires no more probabilistic calculation:

\yskip\hang\textindent{\bf S1.}Set $y←z$, $r←e$, $v←u↑{(q+1)/2}\mod p$,
$w←u↑q\mod p$.

\yskip\hang\textindent{\bf S2.}If $w=1$, stop; $v$ is the answer. Otherwise
find the smallest $k$ such that $w↑{2↑k}\mod p$ is equal to 1.
If $k=r$, stop (there is
no answer); otherwise set$$(y,r,v,w)←(y↑{2↑{r-k}},k,vy↑{2↑{r-k-1}},wy↑{2↑{r-k}})$$
and repeat step S2.\quad\blackslug

\yyskip The validity of this algorithm follows from the invariant congruences
$uw≡v↑2$, $y↑{2↑{r-1}}≡-1$, $w↑{2↑{r-1}}≡1\modulo p$. On the average, step S2
will require about ${1\over4}e↑2$ multiplications mod $p$.\xskip
Reference: {\sl Proc.\ Second Manitoba Conf.\ Numer.\ Math.\ }(1972),
\hbox{58--62.} A related method was published by A. Tonelli, {\sl G\"ottinger
Nachrichten} (1891), 344--346.
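A Python sketch of the whole square-root computation, including steps S1 and S2 (the nonresidue used to form $z$ is found by trial, which succeeds quickly in practice; function name ours):

```python
def sqrt_mod(u, p):
    """Square root of u modulo an odd prime p (Shanks); None if none exists."""
    u %= p
    if u == 0:
        return 0
    if pow(u, (p - 1) // 2, p) != 1:
        return None                          # u is not a quadratic residue
    q, e = p - 1, 0
    while q % 2 == 0:
        q //= 2
        e += 1                               # p - 1 = 2^e * q, q odd
    n = 2                                    # search for a quadratic nonresidue
    while pow(n, (p - 1) // 2, p) != p - 1:
        n += 1
    y = pow(n, q, p)                         # the constant z: order 2^e
    r, v, w = e, pow(u, (q + 1) // 2, p), pow(u, q, p)    # step S1
    while w != 1:                            # step S2; invariant u*w = v^2
        k, t = 0, w
        while t != 1:                        # smallest k with w^(2^k) = 1
            t = t * t % p
            k += 1
        ys = pow(y, 1 << (r - k - 1), p)     # y^(2^(r-k-1))
        y = ys * ys % p
        r, v, w = k, v * ys % p, w * y % p
    return v
```

When $p\mod4=3$ the loop body is never executed, in accord with the remark that $v=u↑{(q+1)/2}$ is already the answer in that case.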
%folio 810 galley 3 (C) Addison-Wesley 1978	*
\ansno 16. (a)\9 Substitute polynomials modulo $p$ for integers,
in the proof for $n = 1$.\xskip (b) The proof for $n=1$
carries over to any finite field.\xskip (c) Since $x = \xi ↑k$ for
some $k$, $x↑{p↑n} = x$ in the field defined by $f(x)$. Furthermore, the elements
$y$ that satisfy the equation $y↑{p↑m} = y$ in the field
are closed under addition, and closed under multiplication;
so if $x↑{p↑m}= x$, then $\xi$ (being a polynomial in
$x$ with integer coefficients) satisfies $\xi↑{p↑m} =
\xi $.

\ansno 17. If $\xi$ is a primitive root, each nonzero element
is some power of $\xi $. Hence the order must be a divisor of
$13↑2 - 1 = 2↑3 \cdot 3 \cdot 7$, and $\varphi (f)$ elements have
order $f$.
\def\\#1{\hbox{$\hfill\hskip-10pt#1\hskip-10pt\hfill$}\hfill}
$$\vbox{\halign{\hfill#⊗\qquad\hfill#⊗\qquad\qquad\hfill#⊗\qquad\hfill#⊗\!
\qquad\qquad\hfill#⊗\qquad\hfill#⊗\qquad\qquad\hfill#⊗\qquad\hfill#\cr
\\f⊗\\{\varphi(f)}⊗\\f⊗\\{\varphi(f)}⊗\\f⊗\\{\varphi(f)}⊗\\f⊗\\{\varphi(f)}\cr
\noalign{\vskip3pt}
1⊗1⊗3⊗2⊗7⊗6⊗21⊗12\cr
2⊗1⊗6⊗2⊗14⊗6⊗42⊗12\cr
4⊗2⊗12⊗4⊗28⊗12⊗84⊗24\cr
8⊗4⊗24⊗8⊗56⊗24⊗168⊗48\cr}}$$

\ansno 18. (a)\9 pp$\biglp p↓1(u↓nx)\bigrp \ldotsm\hbox{pp}
\biglp p↓r(u↓nx)\bigrp$, by Gauss's lemma. For example, let
$$u(x) = 6x↑3 - 3x↑2 + 2x - 1,\qquad v(x) = x↑3 - 3x↑2 + 12x
- 36 = (x↑2 + 12)(x - 3);$$
then pp$(36x↑2 + 12) = 3x↑2 + 1$, pp$(6x - 3) =
2x - 1$.\xskip (This is a modern version of a fourteenth-century trick
used for many years to help solve algebraic equations.)

(b)\9 Let pp$\biglp w(u↓nx)\bigrp = \=w↓mx↑m
+ \cdots + \=w↓0 = w(u↓nx)/c$, where $c$ is the content
of $w(u↓nx)$ as a polynomial in $x$. Then $w(x) = (c\=w↓m/u↑{m}↓{n})x↑m
+ \cdots + c\=w↓0$, hence $c\=w↓m = u↑{m}↓{n}$;
since $\=w↓m$ is a divisor of $u↓n$, $c$ is a multiple of $u↑{m-1}↓{n}$.

\ansno 19. If $u(x) = v(x)w(x)$ with deg$(v)$, deg$(w) ≥
1$, then $u↓nx↑n ≡ v(x)w(x) \modulo p$. By unique factorization
modulo $p$, all but the leading coefficients of $v$ and $w$ are multiples of $p$,
and $p↑2$ divides $v↓0w↓0=u↓0$.

\ansno 20. (a)\9 $\sum (αu↓j - u↓{j-1})(\=α
{\=u}↓j - {\=u}↓{j-1}) = \sum (u↓j - \=αu↓{j-1})({\=u}↓j
- α{\=u}↓{j-1})$.\xskip (b) We may assume that $u↓0 ≠ 0$. Let
$m(u) = \prod↓{1≤j≤n} \min(1, |α↓j|) = u↓0/M(u)u↓n$. Whenever
$|α↓j| < 1$, change the factor $x - α↓j$ to ${\=α}↓jx -
1$ in $u(x)$; this doesn't affect $|u|$. Now looking at the
leading and trailing coefficients only, we have $|u|↑2 ≥ |u↓n|↑2m(u)↑2
+ |u↓n|↑2M(u)↑2$; hence we obtain the slightly stronger result
$M(u)↑2 ≤ \biglp |u|↑2 + (|u|↑4 - 4|u↓0u↓n|↑2)↑{1/2}\bigrp
/2|u↓n|↑2$.\xskip (c) $u↓j = u↓m \sum α↓{i↓1}\ldotsm α↓{i↓{m-j}}$,
an elementary symmetric function, hence
$|u↓j| ≤ |u↓m| \sum β↓{i↓1} \ldotsm β↓{i↓{m-j}}$
where $β↓i = \max(1, |α↓i|)$. We complete the proof by showing
that when $x↓1≥1$, $\ldotss$, $x↓n≥1$, and $x↓1 \ldotsm x↓n = M$,
the elementary symmetric function $\sigma ↓{nk} = \sum x↓{i↓1}
\ldotsm x↓{i↓k}$ is $≤{n-1\choose k-1}M + {n-1\choose k}$,
the value assumed when $x↓1 = \cdots = x↓{n-1} = 1$ and $x↓n
= M$.\xskip$\biglp$For if $x↓1 ≤ \cdots ≤ x↓n < M$, the transformation $x↓n
← x↓{n-1}x↓n$, $x↓{n-1} ← 1$ increases $\sigma ↓{nk}$ by $\sigma
↓{(n-2)(k-1)}(x↓n - 1)(x↓{n-1} - 1) > 0$.$\bigrp$\xskip (d) $|v↓j| ≤ |v↓m|
\mathopen{\vcenter{\hbox{\:@\char'0}}}{m-1\choose j}M(v)+
{m-1\choose j-1}\mathclose{\vcenter{\hbox{\:@\char'1}}}≤|u↓n|
\mathopen{\vcenter{\hbox{\:@\char'0}}}{m-1\choose j}M(u)+
{m-1\choose j-1}\mathclose{\vcenter{\hbox{\:@\char'1}}}$
since $|v↓m| ≤ |u↓n|$ and
$M(v) ≤ M(u)$.\xskip [These results are slight extensions of inequalities
due to M. Mignotte, {\sl Math.\ Comp.\ \bf 28} (1974), 1153--1157. See also
J. Vicente Gon\c calves, {\sl Revista de Faculdade de Ci\A encias} (2) A
{\bf1} (Univ. Lisbon, 1950), 167--171.]
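Combining the bound in (d) with Landau's inequality $M(u)|u↓n|≤|u|$, which follows from (b), gives the bound usually applied in practice, $|v↓j|≤{m-1\choose j}|u|+{m-1\choose j-1}|u↓n|$; a Python sketch (name ours):

```python
from math import comb

def mignotte_bounds(u, m):
    """b[j] >= |v_j| for every integer polynomial v of degree m that
    divides u (coefficient list, lowest degree first), using
    |v_j| <= C(m-1,j)*|u| + C(m-1,j-1)*|lc(u)| with |u| the 2-norm."""
    norm = sum(c * c for c in u) ** 0.5
    lead = abs(u[-1])
    def C(n, k):
        return comb(n, k) if 0 <= k <= n else 0
    return [C(m - 1, j) * norm + C(m - 1, j - 1) * lead
            for j in range(m + 1)]
```

For $u(x)=x↑8+x↑6-3x↑4-3x↑3+8x↑2+2x-5$, whose squared 2-norm is 113, a quadratic factor gets $|v↓1|≤\lfloor1+\sqrt{113}\rfloor=11$ and $|v↓0|≤10$, the numbers used in the answer to exercise 22.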

\ansno 21. (a)\9 $\int↑{1}↓{0}\biglp u↓ne(n\theta ) + \cdots + u↓0\bigrp \biglp {\=u}↓n
e(-n\theta ) + \cdots + {\=u}↓0\bigrp \,d\theta = |u↓n|↑2 +
\cdots + |u↓0|↑2$ since $\int ↑{1}↓{0} e(j\theta )e(-k\theta
)\,d\theta = \delta↓{jk}$; now use induction on $t$.\xskip (b) Since
$|v↓j| ≤ {m\choose j}M(v)|v↓m|$ we conclude that $|v|↑2 ≤ {2m\choose
m}M(v)↑2|v↓m|↑2$. Hence $|v|↑2|w|↑2 ≤ {2m\choose m}{2k\choose
k}M(v)↑2M(w)↑2|v↓mw↓k|↑2 = f(m, k)M(u)↑2|u↓n|↑2 ≤ f(m, k)|u|↑2$.\xskip
[Slightly better values for $f(m, k)$ are possible based on
the more detailed information in exercise 20.]\xskip (c) The case
$t = 3$ suffices to show how to get from $t - 1$ to $t$. When
$t = 2$ we have shown that, for all $\theta ↓1$, $$\twoline{\textstyle\int ↑{1}↓{0}
\int ↑{1}↓{0} \int ↑{1}↓{0} \int ↑{1}↓{0}\left|v\biglp e(\theta
↓1), e(\phi ↓2), e(\phi ↓3)\bigrp\right|↑2\left|w\biglp e(\theta ↓1),
e(\psi ↓2), e(\psi ↓3)\bigrp \right|↑2\,d\phi ↓2\,d\phi ↓3\,d\psi ↓2\,
d\psi ↓3}{3pt}{\textstyle≤ f(m↓2, k↓2)f(m↓3, k↓3) \int ↑{1}↓{0} \int ↑{1}↓{0}
\left|v\biglp e(\theta ↓1), e(\theta ↓2), e(\theta ↓3)\bigrp \right|↑2
\left|w\biglp e(\theta ↓1), e(\theta ↓2), e(\theta ↓3)\bigrp \right|↑2
\,d\theta ↓2\,d\theta ↓3.}$$ For all $\phi ↓2$, $\phi ↓3$, $\psi ↓2$,
$\psi ↓3$ we have also shown that$$\twoline{\textstyle\int ↑{1}↓{0} \int ↑{1}↓{0}
\left|v\biglp e(\phi ↓1), e(\phi ↓2), e(\phi ↓3)\bigrp \right|↑2\left|w\biglp
e(\psi ↓1), e(\psi ↓2), e(\psi ↓3)\bigrp \right|↑2\,d\phi ↓1\,d\psi
↓1}{3pt}{\textstyle≤ f(m↓1, k↓1) \int ↑{1}↓{0}\left|v\biglp e(\theta ↓1), e(\phi
↓2), e(\phi ↓3)\bigrp \right|↑2\left|w\biglp e(\theta ↓1), e(\psi ↓2),
e(\psi ↓3)\bigrp \right|↑2\, d\theta ↓1.}$$Integrate the former inequality
with respect to $\theta ↓1$ and the latter with respect to $\phi
↓2$, $\phi ↓3$, $\psi ↓2$, $\psi ↓3$.\xskip [This method was used by A.
O. Gel'fond in {\sl Transcendental and Algebraic Numbers} (New
York: Dover, 1960), Section 3.4, to derive a slightly different
result.]

\def\\{\hbox{deg}}
\ansno 22. More generally, assume that $u(x) ≡ v(x)w(x)\modulo
q$, $a(x)v(x) + b(x)w(x) ≡ 1 \modulo p$, and $c\lscr(v) ≡ 1
\modulo r$, $\\(a) < \\(w)$, $\\(b) < \\(v)$, $\\(u)
= \\(v) + \\(w)$, where $r = \gcd(p, q)$ and $p, q$ needn't
be prime. We shall construct polynomials $V(x) ≡ v(x)$ and $W(x)
≡ w(x) \modulo q$ such that $u(x) ≡ V(x)W(x) \modulo {qr}$,
$\lscr(V) = \lscr(v)$, $\\(V) = \\(v)$, $\\(W) =\\(w)$; furthermore,
if $r$ is prime, the results will be unique modulo $qr$.

The problem asks us to find $\=v(x)$ and
$\=w(x)$ with $V(x) = v(x) + q\=v(x)$, $W(x)
= w(x) + q\=w(x)$, $\\(\=v) < \\(v)$, $\\(\=w)≤
\\(w)$; and the other condition $\biglp v(x) + q\=v
(x)\bigrp\biglp w(x) + q\=w(x)\bigrp ≡ u(x)\modulo
{qr}$ is equivalent to $\=w(x)v(x) + \=v(x)w(x)
≡ f(x)\modulo r$, where $u(x) ≡ v(x)w(x) + qf(x) \modulo
{qr}$. We have $\biglp a(x)f(x) + t(x)w(x)\bigrp v(x) + \biglp
b(x)f(x) - t(x)v(x)\bigrp w(x) ≡ f(x) \modulo r$ for all
$t(x)$. Since $\lscr(v)$ has an inverse modulo $r$, we can find
a quotient $t(x)$ by Algorithm 4.6.1D such that $\\(bf - tv)
< \\(v)$; for this $t(x)$, $\\(af + tw) ≤ \\(w)$, since
$\\(f) ≤ \\(u) = \\(v) + \\(w)$. Thus the desired
solution is $\=v(x) = b(x)f(x) - t(x)v(x) = b(x)f(x)\mod
v(x)$, $\=w(x) = a(x)f(x) + t(x)w(x)$. If $\biglp \={\=v}
(x), \={\=w}(x)\bigrp$ is another
solution, we have $\biglp \=w(x) - \={\=w}(x)\bigrp
v(x) ≡ \biglp \={\=v}(x) - \=v(x)\bigrp w(x)
\modulo r$. Thus if $r$ is prime, $v(x)$ must divide $\={\=v}
(x) - \=v(x)$; but $\\(\={\=v}
- \=v) < \\(v)$, so $\={\=v}(x) = 
\=v(x)$ and $\={\=w}(x) = \=w(x)$.
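One step of this construction, in the simplest case $r=p$ with $\lscr(v)=1$, can be sketched in Python (coefficient lists, lowest degree first; helper names ours):

```python
def pmulz(a, b):
    """Product of integer polynomials."""
    r = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            r[i + j] += ai * bj
    return r

def pdivmod(a, b, m):
    """Quotient and remainder of a/b with coefficients mod m."""
    a = [c % m for c in a]
    q = [0] * max(len(a) - len(b) + 1, 1)
    inv = pow(b[-1], -1, m)
    for s in range(len(a) - len(b), -1, -1):
        c = a[s + len(b) - 1] * inv % m
        q[s] = c
        for i, bi in enumerate(b):
            a[s + i] = (a[s + i] - c * bi) % m
    while len(a) > 1 and a[-1] == 0:
        a.pop()
    return q, a

def hensel_step(u, v, w, a, b, p, q):
    """From u = v*w (mod q) and a*v + b*w = 1 (mod p), with q a power
    of p and lc(v) = 1, produce (V, W) with u = V*W (mod q*p) and
    V = v, W = w (mod q).  Representatives of v, w matter mod q*p."""
    vw = pmulz(v, w)
    f = [((uc - c) // q) % p
         for uc, c in zip(u, vw + [0] * (len(u) - len(vw)))]
    t, vbar = pdivmod(pmulz(b, f), v, p)       # vbar = b*f mod v
    af, tw = pmulz(a, f), pmulz(t, w)
    wbar = [(x + y) % p for x, y in            # wbar = a*f + t*w mod p
            zip(af + [0] * len(tw), tw + [0] * len(af))]
    while len(wbar) > 1 and wbar[-1] == 0:
        wbar.pop()
    V = [(vc + q * c) % (q * p) for vc, c in
         zip(v, vbar + [0] * (len(v) - len(vbar)))]
    W = [(wc + q * c) % (q * p) for wc, c in
         zip(w, wbar + [0] * (len(w) - len(wbar)))]
    return V, W
```

For example, $u(x)=x↑2-6$ with $v(x)≡x-1$, $w(x)≡x+1$, $a=2$, $b=3$, $p=q=5$ lifts to $V(x)=x+9$, $W(x)=x+16$, and $V(x)W(x)≡u(x)\modulo{25}$; iterating lifts the factorization modulo 125, and so on.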

For $p = 2$, the factorization proceeds as follows
(writing only the coefficients, and using bars for negative
digits): Exercise 10 says that $v↓1(x) = (\overline1\,\overline
1\,\overline1)$, $w↓1(x) = (\overline1\,\overline1\,\overline1\,
0\,0\,\overline1\,\overline1)$ in one-bit two's complement notation.
Euclid's extended algorithm yields $a(x) = (1\,0\,0\,0\,0\,1)$, $b(x)
= (1\,0)$. The factor $v(x) = x↑2 + c↓1x + c↓0$ must have $|c↓1|
≤ \lfloor 1 + \sqrt{113}\rfloor = 11$, $|c↓0| ≤ 10$, by exercise
20. Three applications of Hensel's lemma yield $v↓4(x) = (1\,
3\,\overline1)$, $w↓4(x) = (1\,\overline3\,\overline5\,\overline4\,
4\,\overline3\,5)$. Thus $c↓1 ≡ 3$ and $c↓0 ≡ -1\modulo {16}$;
the only possible quadratic factor of $u(x)$ is $x↑2 + 3x -
1$. Division fails, so $u(x)$ is irreducible.\xskip$\biglp$Since we have
now proved the irreducibility of this beloved polynomial by
four separate methods, it is unlikely that it has any factors.$\bigrp$

Hans Zassenhaus has observed that we can often
speed up such calculations by increasing $p$ as well as $q$:
In the above notation, we can find $A(x)$, $B(x)$ such that $A(x)V(x)
+ B(x)W(x) ≡ 1 \modulo {pr}$, namely by taking $A(x) = a(x)
+ p\=a(x)$, $B(x) = b(x) + p\=b(x)$, where $\=a(x)V(x)
+ \=b(x)W(x) ≡ g(x)\modulo r$, $a(x)V(x)
+ b(x)W(x) ≡ 1 - pg(x) \modulo {pr}$. We can also find $C$
with $\lscr(V)C ≡ 1\modulo {pr}$. In this way we can lift a
squarefree factorization $u(x) ≡ v(x)w(x) \modulo p$ to its
unique extensions modulo $p↑2$, $p↑4$, $p↑8$, $p↑{16}$, etc. However,
this ``accelerated'' procedure reaches a point of diminishing
returns in practice, as soon as we get to double-precision moduli,
since the time for multiplying multiprecision numbers in practical
ranges outweighs the advantage of squaring the modulus directly.
From a computational standpoint it seems best to work with the
successive moduli $p$, $p↑2$, $p↑4$, $p↑8$, $\ldotss$, $p↑E$, $p↑{E+e}$,
$p↑{E+2e}$, $p↑{E+3e}$, $\ldotss $, where $E$ is the smallest power
of 2 with $p↑E$ greater than single precision and $e$ is the
largest integer such that $p↑e$ has single precision.

Hensel's lemma, which he introduced in order to
demonstrate the factorization of polynomials over the field
of $p$-adic numbers (see exercise 4.1--31), can be generalized
in several ways. First, if there are more factors, say $u(x)
≡ v↓1(x)v↓2(x)v↓3(x) \modulo p$, we can find $a↓1(x)$, $a↓2(x)$,
$a↓3(x)$ such that $a↓1(x)v↓2(x)v↓3(x) + a↓2(x)v↓1(x)v↓3(x) +
a↓3(x)v↓1(x)v↓2(x) ≡ 1 \modulo p$, $\\(a↓i) < \\(v↓i)$.\xskip
$\biglp$In essence, $1/u(x)$ is expanded in partial fractions as $\sum
a↓i(x)/v↓i(x)$.$\bigrp$\xskip An exactly analogous construction now allows
us to lift the factorization without changing the leading coefficients
of $v↓1$ and $v↓2$; we take $\=v↓1(x) = a↓1(x)f(x)\mod
v↓1(x)$, $\=v↓2(x) = a↓2(x)f(x)\mod v↓2(x)$, etc. Another
important generalization is to several simultaneous moduli,
of the respective forms $p↑e$, $(x↓2 - a↓2)↑{n↓2}$, $\ldotss
$, $(x↓t - a↓t)↑{n↓t}$, when performing multivariate gcds
and factorizations. Cf.\ D. Y. Y. Yun, Ph.D. Thesis (M.I.T.,
1974).

\ansno 23. The discriminant of pp$\biglp u(x)\bigrp$ is a nonzero integer (cf.\
exercise 4.6.1--12), and there are multiple factors modulo $p$
iff $p$ divides the discriminant.\xskip$\biglp$The factorization of (21)
modulo 3 is $(x + 1)(x↑2 - x - 1)↑2(x↑3 + x↑2 - x + 1)$; squared
factors for this polynomial occur only for $p = 3$, 23, 233
and 121702457. It is not difficult to prove that the smallest
prime that is not unlucky is at most $O(n\log Nn)$, if $n
= \\(u)$ and $N$ bounds the coefficients of $u(x)$.$\bigrp$

%folio 818 galley 4a (C) Addison-Wesley 1978	*
\ansno 24. Multiply a monic polynomial with rational coefficients
by a suitable nonzero integer, to get a primitive polynomial
over the integers. Factor this polynomial over the integers,
and then convert the factors back to monic.\xskip (No factorizations
are lost in this way; see exercise 4.6.1--8.)

\ansno 25. Consideration of the constant term shows there are
no factors of degree 1, so if the polynomial is reducible, it
must have one factor of degree 2 and one of degree 3. Modulo
2 the factors are $x(x + 1)↑2(x↑2+x+1)$; this is not much help.
Modulo 3 the factors are $(x + 2)↑2(x↑3+2x+2)$. Modulo 5 they
are $(x↑2 + x + 1)(x↑3 + 4x + 2)$. So we see that the answer
is $(x↑2 + x + 1)(x↑3 - x + 2)$.

\ansno 26. Begin with $D ← (0 \ldotsm 0 1)$, representing the
set $\{0\}$. Then for $1 ≤ j ≤ r$, set $D ← D ∨ (D\lsh d↓j)$,
where $∨$ denotes logical ``or'' and $D\lsh d$ denotes $D$ shifted left
$d$ bit positions.\xskip$\biglp$Actually we need only work with a bit
vector of length $\lceil(n + 1)/2\rceil$, since $n - m$ is in the set iff
$m$ is.$\bigrp$
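The bit-vector computation can be sketched as follows (name ours); it returns the degrees realizable as a sum of a subset of the given factor degrees:

```python
def possible_factor_degrees(n, degs):
    """Degrees of possible true factors, given the degrees degs of the
    factors modulo p, via the bit trick D = D | (D << d)."""
    D = 1                  # bit 0 set: the empty product has degree 0
    for d in degs:
        D |= D << d
    return [m for m in range(n + 1) if (D >> m) & 1]
```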

\ansno 27. Exercise 4 says that a random polynomial of degree
$n$ is irreducible modulo $p$ with rather low probability, about
$1/n$. But the Chinese remainder theorem implies that a random
monic polynomial of degree $n$ over the integers will be reducible
with respect to each of $k$ distinct primes with probability
about $(1 - 1/n)↑k$, and this approaches zero as $k → ∞$. Hence
almost all polynomials over the integers are irreducible with
respect to infinitely many primes; and almost all primitive
polynomials over the integers are irreducible.\xskip [Another proof
has been given by W. S. Brown, {\sl AMM \bf 70} (1963), 965--969. See
also the generalization cited in the answer to exercise 33.]

\def\\{\hbox{deg}}
\ansno 28. False, we lose all $p↓j$ with $e↓j$ divisible by
$p$. True if $p ≥ \\(u)$.

\ansno 29. Compute $v↓1(x) = \gcd\biglp v(x), v↑\prime (x)\bigrp$;
this is legitimate since $v↓1(x)$ is relatively prime to $v↑\prime
(x)/v↓1(x)$. Let $v↓0(x) = v(x)/v↓1(x)$, the (squarefree) product
of all irreducible factors of $v(x)$. Compute $d↓1(x) = \gcd\biglp
u(x), v↓0(x)\bigrp$ and $u↓1(x) = u(x)/d↓1(x)$. If $\\(d↓j)
> 0$ for $j ≥ 1$, compute $\=d↓{j+1}(x) = \gcd\biglp
d↓j(x), v↓j(x)\bigrp$, $d↓{j+1}(x) = \gcd\biglp \=d↓{j+1}(x),
u↓j(x)\bigrp$, $v↓{j+1}(x) = v↓j(x)/\=d↓{j+1}(x)$, $u↓{j+1}(x)
= u↓j(x)/d↓{j+1}(x)$; but if deg$(d↓j) = 0$, terminate
the computation with the answer $d(x) = d↓1(x) \ldotsm d↓j(x)$.\xskip
$\biglp$ In this method, $d↓j(x)$ is the squarefree product of
all irreducible factors that occur $≥j$ times in $\gcd\biglp
u(x), v(x)\bigrp$. There are several ways to avoid redundant
calculations; for example, the same $p$ can be used in the underlying
gcd computations, and the gcd routine should return the values
of $u(x)/d(x)$ and $v(x)/d(x)$ that it computes. Furthermore
the computation is unsymmetric in $u$ and $v$; it seems better
to interchange the r\A oles of $u$ and $v$ if $\\(u)>\\(v)$
at the beginning or if $\\(u↓j) >\\(v↓j)$ when
computing $d↓{j+1}(x)$.$\bigrp$

\ansno 30. Cf.\ exercise 4; the probability is the coefficient
of $z↑n$ in ${(1+a↓{1p}z/p)}\*
{(1 + a↓{2p}z↑2/p↑2)}\*{(1 + a↓{3p}z↑3/p↑3)}\ldotsm $, which has the
limiting value $g(z) = {(1 + z)}\*{(1 + {1\over 2}z↑2)}\*
{(1 + {1\over 3}z↑3)}\ldotsm\,$. For $1 ≤ n ≤ 10$ the answers are 1, ${1\over
2}$, ${5\over 6}$, ${7\over 12}$, ${37\over 60}$, ${79\over
120}$, ${173\over 280}$, ${101\over 168}$, ${127\over 210}$, ${1033\over
1680}$.\xskip$\biglp$Let
$f(y) = \ln(1 + y) - y = O(y↑2)$. We have $$\textstyle g(z) = \exp\biglp \sum
↓{n≥1} z↑n/n + \sum ↓{n≥1} f(z↑n/n)\bigrp =
h(z)/(1 - z),$$ and it can be shown that the limiting probability
is $h(1) =
\exp\biglp \sum ↓{n≥1} f(1/n)\bigrp =e↑{-\gamma}\approx .56146$
as $n → ∞$ [cf.\ D. H. Lehmer, {\sl Acta Arith.\ \bf 21} (1972),
379--388]. Indeed, N. G. de Bruijn has established the asymptotic
formula $\lim↓{p→∞} a↓{np} =
e↑{-\gamma}+ e↑{-\gamma}/n + O(n↑{-2}\log n)$.$\bigrp$

\ansno 31. One can do arithmetic in a field $F$ of $p↑d$ elements by letting the
elements of $F$ be polynomials $s(y)$, modulo $p$ and modulo any given
irreducible polynomial $q(y)$ of degree $d$. Since every irreducible factor
$h(x)$ of $g(x)$ is a divisor of $x↑{p↑d}-x=\prod\leftset x-s(y)\relv
s(y)\in F\rightset$, $h(x)$ must have {\sl linear} factors when regarded as
a polynomial over $F$. If $h\biglp s(y)\bigrp=0$ then also $h\biglp s(y)↑p\bigrp
=0$ and $s(y)↑{p↑k}≠s(y)$ for $1≤k<d$; hence the complete factorization of
$h(x)$ is $\biglp x-s(y)\bigrp\biglp x-s(y)↑p\bigrp\ldotsm\biglp x-s(y)↑{p↑{d-1}}
\bigrp$.

To find a factor of $g(x)$, find a root $s(y)$ of $g(x)$, working in $F$, and
compute $h(x)$ by evaluating the above product. To find a root $s(y)$ of
$g(x)$ over $F$, note that $\gcd\biglp w(x),\,(x+t(y))↑{(p↑d-1)/2}-1\bigrp$
will be a nontrivial factor of $w(x)$ at least about half of the time, whenever
$w(x)$ has nothing but linear factors over $F$, as in exercise 14. Thus,
start with $w(x)=g(x)$ and find a nontrivial factor $f(x)$ over $F$; then
replace $w(x)$ by $f(x)$ or by $w(x)/f(x)$, whichever has smaller degree,
and repeat until $w(x)$ has degree 1.

[A similar procedure can be used when $p=2$: In this case $\gcd\biglp w(x),\,
T(t(y)x)\bigrp$ will be a nontrivial factor of $w(x)$ at least half of the time,
whenever $w(x)$ has nothing but linear factors over $F$, where $T(x)=
x+x↑2+x↑4+\cdots+x↑{2↑{d-1}}$. For exactly half of the elements $s(y)$ of
$F$ satisfy $T\biglp s(y)\bigrp=0$, and $T\biglp s↓1(y)t(y)\bigrp=
T\biglp s↓2(y)t(y)\bigrp$ iff $T\biglp(s↓1(y)-s↓2(y))t(y)\bigrp=0$.]

\def\+{\hbox{$\Psi$\hskip-1.5pt}}
\ansno 32. (a)\9 Clearly $x↑n-1=\prod↓{d\rslash n}\+↓d(x)$, since every complex
$n$th root of unity is a primitive $d$th root for some unique $d\rslash n$.
The second identity follows from the first; and $\+↓n(x)$ has integer
coefficients since it is expressed in terms of products and quotients of monic
polynomials with integer coefficients.\xskip(b) The condition in the hint
suffices to prove that $f(x)=\+↓n(x)$, so we shall take the hint. When
$p$ does not divide $n$, we have $\gcd(x↑n-1,nx↑{n-1})=1$ modulo $p$, hence
$x↑n-1$ is squarefree modulo $p$. Given $f(x)$ and $\zeta$ as in the hint,
let $g(x)$ be the irreducible factor of $\+↓n(x)$ such that $g(\zeta↑p)=0$.
If $g(x)≠f(x)$ then both $f(x)$ and $g(x)$ are distinct factors of $\+↓n(x)$,
hence they are distinct factors of $x↑n-1$, hence they have no irreducible
factors in common modulo $p$. However, $\zeta↑p$ is a root of $f(x↑p)$,
so $\gcd\biglp g(x),f(x↑p)\bigrp≠1$ over the integers, hence $g(x)$ is
a divisor of $f(x↑p)$. By (5), $g(x)$ is a divisor of $f(x)↑p$, modulo $p$,
contradicting the assumption that $f(x)$ and $g(x)$ have no irreducible
factors in common. Therefore $f(x)=g(x)$.\xskip[The irreducibility of
$\+↓n(x)$ was first proved for prime $n$ by K. F. Gauss in {\sl Disquisitiones
Arithmetic\ae\ }(Leipzig, 1801), Art.\ 341, and for general $n$ by L.
Kronecker, {\sl J. de Math. Pures et Appliqu\'ees \bf19} (1854), 177--192.]

(c)\9 $\+↓1(x)=x-1$; and when $p$ is prime, $\+↓p(x)=1+x+\cdots+x↑{p-1}$.
If $n>1$ is odd, it is not difficult to prove that $\+↓{2n}(x)=\+↓n(-x)$.
If $p$ divides $n$, the second identity in (a) shows that $\+↓{pn}(x)=
\+↓n(x↑p)$. If $p$ does not divide $n$, we have $\+↓{pn}(x)=\+↓n(x↑p)/
\+↓n(x)$. For nonprime $n≤15$ we have $\+↓4(x)=x↑2+1$, $\+↓6(x)=
x↑2-x+1$, $\+↓8(x)=x↑4+1$, $\+↓9(x)=x↑6+x↑3+1$, $\+↓{10}(x)=
x↑4-x↑3+x↑2-x+1$, $\+↓{12}(x)=x↑4-x↑2+1$,
$\+↓{14}(x)=x↑6-x↑5+x↑4-x↑3+x↑2-x+1$, $\+↓{15}(x)=
x↑8-x↑7+x↑5-x↑4+x↑3-x+1$.\xskip[The formula $\+↓{pq}(x)=(1+x↑p+\cdots+x↑{(q-1)p})
(x-1)/(x↑q-1)$ can be used to show that $\+↓{pq}(x)$ has all coefficients
$\pm1$ or 0 when $p$ and $q$ are prime; but the coefficients of $\+↓{pqr}(x)$
can be arbitrarily large.]
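The identities in (a) give a direct way to compute $\+↓n(x)$ over the integers, dividing $x↑n-1$ by $\+↓d(x)$ for each proper divisor $d$ of $n$; a Python sketch (names ours):

```python
def pdiv_exact(a, b):
    """Exact quotient of integer polynomials a/b (lowest degree first);
    requires lc(b) = 1 or -1, which holds for cyclotomic divisors."""
    a = a[:]
    q = [0] * (len(a) - len(b) + 1)
    for s in range(len(a) - len(b), -1, -1):
        c = a[s + len(b) - 1] // b[-1]
        q[s] = c
        for i, bi in enumerate(b):
            a[s + i] -= c * bi
    return q

def cyclotomic(n):
    """Coefficients of Phi_n(x), via x^n - 1 = prod over d|n of Phi_d(x)."""
    num = [-1] + [0] * (n - 1) + [1]        # x^n - 1
    for d in range(1, n):
        if n % d == 0:
            num = pdiv_exact(num, cyclotomic(d))
    return num
```

Running it for $n=105$ exhibits a coefficient $-2$, the smallest $n$ for which a coefficient other than $\pm1$ or 0 appears, in line with the closing remark about $\+↓{pqr}(x)$.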

\ansno 33. The exact probability is $\prod↓{j≥1}(a↓{jp}/p↑j)↑{k↓j}/k↓j!$,
where $k↓j$ is the number of $d↓i$ equal to $j$. Since $a↓{jp}\approx 1/j$
by exercise 4, we get the formula of exercise 1.3.3--21.

{\sl Notes:} This exercise says that if we fix the prime $p$ and let the
polynomial $u(x)$ be random, it will have a certain probability of splitting
in a given way modulo $p$. A much harder problem is to fix the polynomial $u(x)$
and to let $p$ be ``random''; but the same asymptotic result holds
for almost all $u(x)$:\xskip G. Frobenius proved in 1880 that the integer
polynomial $u(x)$ splits modulo $p$ into factors of degrees $d↓1$, $\ldotss$, $d↓r$,
when $p$ is a large prime chosen at random, with probability equal to the
number of permutations in the Galois group $G$ of $u(x)$ having cycle lengths
$\{d↓1,\ldotss,d↓r\}$ divided by the total number of permutations in $G$.\xskip
[If $u(x)$ has rational coefficients and distinct roots $\xi↓1$, $\ldotss$,
$\xi↓n$ over the complex numbers, its Galois group is the (unique) group $G$
of permutations such that the polynomial $\prod↓{p(1)\ldotsm p(n)\in G}
\biglp z+\xi↓{p(1)}y↓1+\cdots+\xi↓{p(n)}y↓n\bigrp=U(z,y↓1,\ldotss,y↓n)$ has rational
coefficients and is irreducible over the rationals.]\xskip Furthermore
B. L. van der Waerden proved in 1934 that almost all polynomials of degree $n$
have the set of all $n!$ permutations as their Galois group. Therefore
almost all fixed irreducible polynomials $u(x)$ will factor as we might expect them
to, with respect to randomly chosen large primes $p$.\xskip References:
{\sl Sitzungsberichte K\"onigl.\ Preu\ss.\ Akad.\ Wiss.\ }(Berlin, 1896),
\hbox{689--703}; {\sl Math.\ Annalen \bf109} (1934), 13--16.

\ansno 34. $d(x)=\prod↓{1≤i≤r}p↓i(x)↑{e↓i-\delta(e↓i)}$, where $\delta(e)=0$ if
$e\mod p=0$, otherwise $\delta(e)=1$. Let $d↓0(x)=u(x)$ and $d↓{k+1}(x)=\gcd\biglp
d↓k(x),d↓{\!k}↑\prime(x)\bigrp$, and consider the sequence $d↓0(x)$, $d↓1(x)$,
$\ldotss$; after $m=\max↓{1≤i≤r}(e↓i\mod p)$ steps we have $d↓m(x)=\prod↓{1≤i≤r}
p↓i(x)↑{p\lfloor e↓i/p\rfloor}$, and $d↓{m+1}(x)=d↓m(x)$. If $d↓m(x)≠1$, it can
be factored by applying (5). The factors of $u(x)/d↓m(x)$ can be written
$q↓1(x)q↓2(x)↑2\ldotss q↓m(x)↑m$ where the polynomials $q↓j(x)$ are squarefree and
relatively prime to each other: $q↓j(x)=\prod\leftset p↓i(x)\relv 1≤i≤r$ and $e↓i
\mod p=j\rightset$. We have $q↓j(x)=d↓{j-1}(x)/\biglp d↓j(x)q↓{j+1}(x)\ldotsm
q↓m(x)\bigrp$ for $j=m$, $m-1$, $\ldotss$, 1. The factorization of $u(x)$ is
completed by factoring $d↓m(x↑{1/p})$ and each $q↓j(x)$.

[Exercise 2(b) indicates that comparatively few polynomials are squarefree, but
$d(x)$ is actually $≠1$ quite frequently in practice; hence this method turns out
to be quite important. {\sl Reference:} D. Y. Y. Yun, {\sl Proc.\ MACSYMA User's
Conf.}, NASA publ.\ CP-2012 (1977), 65--70.]
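As a concrete illustration of the first step (the example polynomial here is ours, not from the answer), take p = 3 and u(x) = x^3 (x+1)^2 (x+2), so e = (3, 2, 1); the formula predicts gcd(u, u') = x^3 (x+1) over GF(3). A brief sketch with polynomial arithmetic mod p, coefficient lists ascending:

```python
# Sketch: over GF(p), gcd(u, u') = prod p_i(x)^(e_i - delta(e_i)),
# where delta(e) = 0 when p divides e, otherwise delta(e) = 1.
P = 3

def trim(a):
    while len(a) > 1 and a[-1] == 0:
        a.pop()
    return a

def pmul(a, b):
    c = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            c[i + j] = (c[i + j] + x * y) % P
    return trim(c)

def pmod(a, b):
    a = a[:]
    inv = pow(b[-1], P - 2, P)            # inverse of leading coefficient
    while len(a) >= len(b) and any(a):
        f = a[-1] * inv % P
        s = len(a) - len(b)
        for j, y in enumerate(b):
            a[s + j] = (a[s + j] - f * y) % P
        trim(a)
    return a

def pgcd(a, b):
    while any(b):
        a, b = b, pmod(a, b)
    inv = pow(a[-1], P - 2, P)            # normalize: make the gcd monic
    return [x * inv % P for x in a]

def deriv(a):
    return trim([i * a[i] % P for i in range(1, len(a))]) or [0]

# u(x) = x^3 (x+1)^2 (x+2) mod 3; expected gcd(u, u') = x^4 + x^3
u = pmul(pmul([0, 0, 0, 1], pmul([1, 1], [1, 1])), [2, 1])
```

The factor x^3 survives intact in the gcd because its exponent is divisible by p, exactly the phenomenon that the $\delta(e)$ rule captures.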
%folio 819 galley 4b (C) Addison-Wesley 1978	*

\ansbegin{4.6.3}

\ansno 1. $2↑{λ(n)}$, the highest
power of 2 less than or equal to $n$.

\ansno 2. Assume that $x$ is input in register
A, and $n$ in location \.{NN}; the output is in register X.

{\yyskip\tabskip30pt \mixfive{\!
01⊗A1⊗ENTX⊗1⊗1⊗\understep{A1. Initialize.}\cr
02⊗⊗STX⊗Y⊗1⊗$Y←1$.\cr
03⊗⊗STA⊗Z⊗1⊗$Z←x$.\cr
\\04⊗⊗LDA⊗NN⊗1⊗$N←n$.\cr
05⊗⊗JMP⊗2F⊗1⊗To A2.\cr
\\06⊗5H⊗SRB⊗1⊗L+1-K\cr
07⊗⊗STA⊗N⊗L+1-K⊗$N←\lfloor N↓{\null}/2\rfloor$.\cr
\\08⊗A5⊗LDA⊗Z⊗L⊗\understep{A5. S\hskip2.5pt}{\sl\hskip-2.5pt
q}\understep{uare $Z$.}\cr
09⊗⊗MUL⊗Z⊗L⊗$Z\times Z\mod w$\cr
10⊗⊗STX⊗Z⊗L⊗\qquad$→Z$.\cr
\\11⊗A2⊗LDA⊗N⊗L⊗\understep{A2. Halve $N$.}\cr
12⊗2H⊗JAE⊗5B⊗L+1⊗To A5 if $N$ is even.\cr
13⊗⊗SRB⊗1⊗K\cr
14⊗⊗STA⊗N⊗K⊗$N←\lfloor N↓{\null}/2\rfloor$.\cr
\\15⊗A3⊗LDA⊗Z⊗K⊗\understep{A3. Multi}{\sl p\hskip-3pt}\understep{\hskip
3pt l\hskip1pt}{\sl\hskip-1pt y\hskip-2pt}\understep{\hskip2pt\ $Y$ b\hskip1pt
}{\sl\hskip-1pt y\hskip-2pt}\understep{\hskip2pt\ $Z$.}\cr
16⊗⊗MUL⊗Y⊗K⊗$Z\times Y\mod w$\cr
17⊗⊗STX⊗Y⊗K⊗\qquad$→Y$.\cr
\\18⊗A4⊗LDA⊗N⊗K⊗\understep{A4. $N=0$?}\cr
19⊗⊗JAP⊗A5⊗K⊗If $N>0$, continue at step A5.\quad\blackslug\cr}}

\yyskip\noindent $\biglp$It would be better programming
practice to change the instruction in line 05 to ``\.{JAP}'', followed
by an error indication. The running time is $21L + 17K + 13$,
where $L = λ(n)$ is one less than the number of bits in the
binary representation of $n$, and $K = \nu (n)$ is the number
of one bits in $n$'s representation. The running time could
be decreased by $K + 6$ units, by inserting step A4 before step
A3 and adding two new instructions to perform A3 when $N = 0.\bigrp$

For the serial program, we may assume that $n$
is small enough to fit in an index register; otherwise serial
exponentiation is out of the question. The following program
leaves the output in register A:

{\yyskip\tabskip30pt\mixfive{\!
01⊗S1⊗LD1⊗NN⊗1⊗$\rI1←n$.\cr
02⊗⊗STA⊗X⊗1⊗$X←x$.\cr
03⊗⊗JMP⊗2F⊗1\cr
\\04⊗1H⊗MUL⊗X⊗N-1⊗$\rA\times X\mod w$\cr
05⊗⊗SLAX⊗5⊗N-1⊗\qquad$→\rA$.\cr
\\06⊗2H⊗DEC1⊗1⊗N⊗$\rI1←\rI1-1$.\cr
07⊗⊗J1P⊗1B⊗N⊗Multiply again if $\rI1>0$.\quad\blackslug\cr}}

\yyskip\noindent The running time for this program is $14N-7$; it is faster than
the previous program when $n≤7$, slower when $n≥8$.
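In present-day terms, the first MIX routine is the right-to-left binary method of Algorithm A. A sketch of the same computation, with w = 2^10 standing in for the MIX word size (our assumption):

```python
# Sketch of the right-to-left binary method that the MIX code implements;
# w = 2**10 here stands in for the machine word size.
def binary_pow(x, n, w=2**10):
    y, z = 1, x % w                  # A1: Y <- 1, Z <- x
    while True:
        n, odd = n >> 1, n & 1       # A2: halve N, remember parity
        if odd:
            y = y * z % w            # A3: multiply Y by Z
            if n == 0:               # A4: N = 0? then Y is the answer
                return y
        z = z * z % w                # A5: square Z, back to A2
```

For n >= 1 the loop always terminates at step A4, since the leading bit of n is 1.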


\ansno 3. The sequences of exponents are:\xskip (a) 1, 2, 3,
6, 7, 14, 15, 30, 60, 120, 121, 242, 243, 486, 487, 974, 975
[16 multiplications];\xskip (b) 1, 2, 3, 4, 8, 12, 24, 36, 72, 108,
216, 324, 325, 650, 975 [14 multiplications];\xskip (c) 1, 2, 3, 6,
12, 15, 30, 60, 120, 240, 243, 486, 972, 975 [13 multiplications];\xskip
(d) 1, 2, 3, 6, 12, 15, 30, 60, 75, 150, 225, 450, 900, 975
[13 multiplications].\xskip $\biglp$The smallest possible number of multiplications
is 12; this is obtainable by combining the factor method with
the binary method, since $975 =15 \cdot (2↑6 + 1).\bigrp$
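These sequences can be verified mechanically; here is a sketch (ours) that checks that sequences (a) through (c) really are addition chains for 975, with the stated number of multiplications (one fewer than the number of elements):

```python
# Sketch: an addition chain starts at 1 and each later element is the
# sum of two earlier elements (possibly the same element twice).
def is_addition_chain(seq):
    if seq[0] != 1:
        return False
    return all(any(seq[i] == seq[j] + seq[k]
                   for j in range(i) for k in range(j + 1))
               for i in range(1, len(seq)))

chains = [
    [1,2,3,6,7,14,15,30,60,120,121,242,243,486,487,974,975],   # (a) 16 mults
    [1,2,3,4,8,12,24,36,72,108,216,324,325,650,975],           # (b) 14 mults
    [1,2,3,6,12,15,30,60,120,240,243,486,972,975],             # (c) 13 mults
]
```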

\ansno 4. $(777777)↓8 = 2↑{18} - 1$.

\def\ansalgstep #1 #2. {\anskip\noindent\hbox to 40pt{\!
\hbox to 19pt{\hskip 0pt plus 1000pt minus 1000pt\bf#1 }\hskip 0pt plus 1000pt
\bf#2. }\hangindent 40pt}
\ansalgstep 5. T1. [Initialize.]\xskip Set $\.{LINKU}[j]
← 0$ for $1 ≤ j ≤ 2↑r$, and set $k ← 0$, $\.{LINKR}[0] ← 1$, $\.{LINKR}[1]
← 0$.

\ansalgstep{} T2. [Change level.]\xskip(Now level $k$
of the tree has been linked together from left to right, starting
at $\.{LINKR}[0]$.) If $k = r$, the algorithm terminates. Otherwise
set $n ← \.{LINKR}[0]$, $m ← 0$.

\ansalgstep{} T3. [Prepare for $n$.]\xskip(Now $n$
is a node on level $k$, and $m$ points to the rightmost node
currently on level $k + 1$.)\xskip Set $q ← 0$, $s ← n$.

\ansalgstep{} T4. [Already in tree?]\xskip(Now $s$ is
a node in the path from the root to $n$.)\xskip If $\.{LINKU}[n + s]
≠ 0$, go to T6 (the value $n + s$ is already in the tree).

\ansalgstep{} T5. [Insert below $n$.]\xskip If $q
= 0$, set $m↑\prime ← n + s$. Set $\.{LINKR}[n + s] ← q$, $\.{LINKU}[n
+ s] ← n$, $q ← n + s$.

\ansalgstep{} T6. [Move up.]\xskip Set $s ←\.{LINKU}[s]$.
If $s ≠ 0$, return to T4.

\ansalgstep{} T7. [Attach group.]\xskip If $q ≠ 0$, set
$\.{LINKR}[m] ← q$, $m ← m↑\prime $.

\ansalgstep{} T8. [Move $n$.]\xskip Set $n ←\.{LINKR}[n]$.
If $n ≠ 0$, return to T3.

\ansalgstep{} T9. [End of level.]\xskip Set $\.{LINKR}[m]
← 0$, $k ← k + 1$, and return to T2.\quad\blackslug
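A direct transcription of steps T1 through T9 can be sketched as follows; the dictionary `levels` is extra bookkeeping added here to expose the result, and the table size 2^r matches the bound in step T1.

```python
# Sketch of Algorithm T: LINKU points up the tree, LINKR links each
# level together from left to right (index 0 serves as the list head).
def power_tree(r):
    size = 2 ** r
    LINKU = [0] * (size + 1)
    LINKR = [0] * (size + 1)
    LINKR[0], LINKR[1] = 1, 0          # T1: initialize, k = 0
    levels = {1: 0}
    for k in range(r):                 # T2: build level k + 1
        n, m = LINKR[0], 0
        while n:                       # T3: prepare for n
            q, s = 0, n
            while s:                   # T4: already in tree?
                if LINKU[n + s] == 0:
                    if q == 0:         # T5: insert below n
                        mprime = n + s
                    LINKR[n + s], LINKU[n + s], q = q, n, n + s
                    levels[n + s] = k + 1
                s = LINKU[s]           # T6: move up the path
            if q:                      # T7: attach group
                LINKR[m], m = q, mprime
            n = LINKR[n]               # T8: move n along its level
        LINKR[m] = 0                   # T9: end of level
    return levels
```

Running `power_tree(4)` reproduces the first few levels of the power tree: {2}, {3, 4}, {5, 6, 8}, and so on.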

\ansno 6. Prove by induction that
the path to the number $2↑{e↓0} + 2↑{e↓1} + \cdots
+ 2↑{e↓t}$, if $e↓0 > e↓1 > \cdots >e↓t ≥ 0$, is 1, 2,
$2↑2$, $\ldotss$, $2↑{e↓0}$, $2↑{e↓0} + 2↑{e↓1}$, $\ldotss$,
$2↑{e↓0}+2↑{e↓1}+ \cdots + 2↑{e↓t}$; furthermore,
the sequences of exponents on each level are in decreasing lexicographic
order.

\ansno 7. The binary and factor methods require
one more step to compute $x↑{2n}$ than $x↑n$; the power tree
method requires at most one more step. Hence (a) $15 \cdot 2↑k$;
(b) $33 \cdot 2↑k$; (c) $23 \cdot 2↑k$; $k = 0$, 1, 2, 3,
$\ldotss\,$.

\ansno 8. The power tree always includes the node
$2m$ at one level below $m$, unless it occurs at the same level
or an earlier level; and it always includes the node $2m + 1$
at one level below $2m$, unless it occurs at the same level.\xskip
(Computational experiments have shown that $2m$ is below $m$
for all $m ≤ 2000$, but it appears very difficult to prove this
in general.)

\ansno 10. By using the ``\.{FATHER}'' representation discussed
in Section 2.3.3: Make use of a table $f[j]$, $1 ≤ j ≤ 100$,
such that $f[1] = 0$ and $f[j]$ is the number of the node
just above $j$ for $j ≥ 2$.\xskip (The fact that each node of this
tree has degree at most two has no effect on the efficiency
of this representation; it just makes the tree look prettier
as an illustration.)

\ansno 11. 1, 2, 3, 5, 10, 20, (23 or 40), 43; 1, 2, 4, 8, 9,
17, (26 or 34), 43; 1, 2, 4, 8, 9, 17, 34, (43 or 68), 77; 1,
2, 4, 5, 9, 18, 36, (41 or 72), 77. If either of the latter
two paths were in the tree we would have no possibility for
$n = 43$, since the tree must contain either 1, 2, 3, 5 or 1,
2, 4, 8, 9.

\ansno 12. No such infinite tree can exist, since $l(n) ≠ l↑*(n)$
for some $n$.
%folio 821 galley 5 (C) Addison-Wesley 1978	*
\ansno 13. For Case 1, use a Type-1 chain followed by $2↑{A+C}
+ 2↑{B+C} + 2↑A + 2↑B$; or use the factor method. For Case 2,
use a Type-2 chain followed by $2↑{A+C+1} + 2↑{B+C} + 2↑A +
2↑B$. For Case 3, use a Type-5 chain followed by addition of
$2↑A + 2↑{A-1}$, or use the factor method. For Case 4, $n =
135 \cdot 2↑D$, so we may use the factor method.

\ansno 14. (a)\9 It is easy to verify that steps $r - 1$ and $r
- 2$ are not both small, so let us assume that step $r - 1$
is small and step $r - 2$ is not. If $c = 1$, then $λ(a↓{r-1})
= λ(a↓{r-k})$, so $k = 2$; and since $4 ≤ \nu (a↓r) = \nu (a↓{r-1})
+ \nu (a↓{r-k}) - 1 ≤ \nu (a↓{r-1}) + 1$, we have $\nu (a↓{r-1})
≥ 3$, making $r - 1$ a star step (lest $a↓0$, $a↓1$, $\ldotss$, $a↓{r-3}$,
$a↓{r-1}$ include only one small step). Then $a↓{r-1} = a↓{r-2}
+ a↓{r-q}$ for some $q$, and if we replace $a↓{r-2}$, $a↓{r-1}$,
$a↓r$ by $a↓{r-2}$, $2a↓{r-2}$, $2a↓{r-2} + a↓{r-q} = a↓r$, we obtain
another counterexample chain in which step $r$ is small; but
this is impossible. On the other hand, if $c ≥ 2$, then $4 ≤
\nu(a↓r) ≤ \nu (a↓{r-1}) + \nu (a↓{r-k}) - 2 ≤ \nu (a↓{r-1})$;
hence $\nu (a↓{r-1}) = 4$, $\nu (a↓{r-k}) = 2$, and $c = 2$. This
leads readily to an impossible situation by a consideration
of the six types in the proof of Theorem B.

(b)\9 If $λ(a↓{r-k}) < m - 1$, we have $c ≥ 3$, so
$\nu (a↓{r-k}) + \nu (a↓{r-1}) ≥ 7$ by (22); therefore both
$\nu (a↓{r-k})$ and $\nu (a↓{r-1})$ are $≥3$. All small steps
must be $≤r - k$, and $λ(a↓{r-k}) = m - k + 1$. If $k ≥ 4$,
we must have $c = 4$, $k = 4$, $\nu (a↓{r-1}) = \nu (a↓{r-4}) =
4$; thus $a↓{r-1} ≥ 2↑m + 2↑{m-1} + 2↑{m-2}$, and $a↓{r-1}$
must equal $2↑m + 2↑{m-1} + 2↑{m-2} + 2↑{m-3}$; but $a↓{r-4}
≥ {1\over 8}a↓{r-1}$ now implies that $a↓{r-1} = 8a↓{r-4}$.
Thus $k = 3$ and $a↓{r-1} > 2↑m + 2↑{m-1}$. Since $a↓{r-2} <
2↑m$ and $a↓{r-3} < 2↑{m-1}$, step $r - 1$ must be a doubling;
but step $r - 2$ is a nondoubling, since $a↓{r-1} ≠ 4a↓{r-3}$.
Furthermore, since $\nu (a↓{r-3}) ≥ 3$, $r - 3$ is a star step;
and $a↓{r-2} = a↓{r-3} + a↓{r-5}$ would imply that $a↓{r-5}
= 2↑{m-2}$, hence we must have $a↓{r-2} = a↓{r-3} + a↓{r-4}$.
As in a similar case treated in the text, the only possibility
is now seen to be $a↓{r-4} = 2↑{m-2} + 2↑{m-3}$, $a↓{r-3} = 2↑{m-2}
+ 2↑{m-3} + 2↑{d+1} + 2↑d$, $a↓{r-1} = 2↑m + 2↑{m-1} + 2↑{d+2}
+ 2↑{d+1}$, and even this possibility is impossible.

\ansno 16. $l↑B(n) = λ(n) + \nu (n) - 1$; so if $n = 2↑k$,
$l↑B(n)/λ(n) = 1$, but if $n = 2↑{k+1} - 1$, $l↑B(n)/λ(n) =
2$.

\ansno 17. Let $i↓1 < \cdots < i↓t$. Delete any intervals $I↓k$
that can be removed without affecting the union $I↓1 ∪ \cdots
∪ I↓t$.\xskip (The interval $(j↓k, i↓k\,]$ may be dropped out if either
$j↓{k+1} ≤ j↓k$ or $j↓1 < j↓2 < \cdots$ and $j↓{k+1}
≤ i↓{k-1}$.)\xskip Now combine overlapping intervals $(j↓1, i↓1]$,
$\ldotss$, $(j↓d, i↓d]$ into an interval $(j↑\prime , i↑\prime
] = (j↓1, i↓d]$ and note that
$$a↓{i↑\prime}<a↓{j↑\prime}(1 + \delta )↑{i↓1-j↓1+\cdots+i↓d-j↓d}
≤ a↓{j↑\prime}(1 + \delta )↑{2(i↑\prime-j↑\prime)},$$
since each point of $(j↑\prime , i↑\prime ]$ is
covered at most twice in $(j↓1, i↓1] ∪ \cdots ∪ (j↓d, i↓d]$.

\ansno 18. Call $f(m)$ a ``nice'' function if
$\biglp\log f(m)\bigrp /m → 0$ as $m→∞$. A
polynomial in $m$ is nice. The product of nice functions is
nice. If $g(m) → 0$ and $c$ is a positive constant, then $c↑{mg(m)}$
is nice; also ${2m\choose mg(m)}$ is nice, for by Stirling's
approximation this is equivalent to saying that $g(m)\log\biglp
1/g(m)\bigrp → 0$.

Now replace each term of the summation
by the maximum term that is attained for any $s$, $t$, $v$. The
total number of terms is nice, and so are ${m+s\choose t+v}$,
${t+v\choose v} ≤ 2↑{t+v}$, and $β↑{2v}$, because $(t +
v)/m → 0$. Finally, ${(m+s)↑2\choose t} ≤ (2m)↑{2t}/t!
< (4m↑2/t)↑te↑t$, where $(4e)↑t$ is nice; setting $t$ to its
maximum value $(1 - {1\over 2}ε)m/λ(m)$, we have the upper bound $(m↑2/t)↑t
= \biglp mλ(m)/(1 - {1\over 2}ε)\bigrp ↑t = 2↑{m(1-ε/2)}
\cdot f(m)$, where $f(m)$ is nice. Hence the entire sum is less
than $α↑m$ for large $m$, if $α = 2↑{1-\eta}$, $0 < \eta < {1\over
2}ε$.

\ansno 19. (a)\9 $M ∩ N$, $M ∪ N$, $M \uplus N$, respectively;
see Eqs.\ 4.5.2--6, 4.5.2--7.

(b)\9 $f(z)g(z)$,\xskip$\lcm\biglp f(z), g(z)\bigrp $,\xskip
$\gcd\biglp f(z), g(z)\bigrp $.\xskip (For the same reasons as (a),
because the monic irreducible polynomials over the complex numbers
are precisely the polynomials $z - \zeta$.)

(c)\9 Commutative laws $A \uplus B = B \uplus A$, $A ∪ B =
B ∪ A$, $A ∩ B = B ∩ A$. Associative laws $A \uplus (B \uplus C) = (A
\uplus B) \uplus C$, $A ∪ (B ∪ C) = (A ∪ B) ∪ C$, $A ∩ (B ∩ C) = (A ∩ B)
∩ C$. Distributive laws $A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)$, $A
∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)$, $A \uplus (B ∪ C) = (A \uplus B) ∪ (A
\uplus C)$, $A \uplus (B ∩ C) = (A \uplus B) ∩ (A \uplus C)$. Idempotent laws
$A ∪ A = A$, $A∩A = A$. Absorption laws $A ∪ (A ∩ B) = A$, $A ∩ (A
∪ B) = A$, $A ∩ (A \uplus B) = A$, $A ∪ (A \uplus B) = A \uplus B$. Identity
and zero laws $\emptyset \uplus A = A$, $\emptyset ∪A=A$, $\emptyset∩A=\emptyset$, 
where $\emptyset$ is the empty
multiset. Counting law $A \uplus B = (A ∪ B) \uplus (A ∩ B)$. Further
properties analogous to those of sets come from the partial
ordering defined by the rule $A \subset B$ iff $A ∩ B = A$ (iff
$A ∪ B = B$).

{\sl Notes:} Other common applications of multisets
are zeros and poles of meromorphic functions, invariants of
matrices in canonical form, invariants of finite Abelian groups,
etc.; multisets can be useful in combinatorial counting arguments
and in the development of measure theory. The terminal strings
of a noncircular context-free grammar form a multiset that
is a set if and only if the grammar is unambiguous. Although
multisets appear frequently in mathematics, they often must
be treated rather clumsily because there is currently no standard
way to treat sets with repeated elements. Several mathematicians
have voiced their belief that the lack of adequate terminology
and notation for this common concept has been a definite handicap
to the development of mathematics.\xskip(A multiset is, of course,
formally equivalent to a mapping from a set into the nonnegative
integers, but this formal equivalence is of little or no practical
value for creative mathematical reasoning.)\xskip The author has discussed
this matter with many people in an attempt to find a good remedy.
Some of the names suggested for the concept were list, bunch, bag,
heap, sample, weighted set, collection; but these words either
conflict with present terminology, have an improper connotation,
or are too much of a mouthful to say and to write conveniently.
It does not seem out of place to coin a new word for such an
important concept, and ``multiset'' has been suggested by N.
G. de Bruijn. The notation ``$A \uplus B$'' has been selected by
the author to avoid conflict with existing notations and to
stress the analogy with set union. It would not be as desirable
to use ``$A + B$'' for this purpose, since algebraists have
found that $A + B$ is a good notation for $\leftset α + β\relv α \in A\hbox{ and }
β \in B\rightset$. If $A$ is a multiset of nonnegative integers,
let $G(z) = \sum ↓{n\in A} z↑n$ be a generating function corresponding
to $A$.\xskip (Generating functions with nonnegative integer coefficients
obviously correspond one-to-one with multisets of nonnegative
integers.) If $G(z)$ corresponds to $A$ and $H(z)$ to $B$, then
$G(z) + H(z)$ corresponds to $A \uplus B$ and $G(z)H(z)$ corresponds
to $A + B$. If we form ``Dirichlet'' generating functions $g(z)
= \sum ↓{n\in A} 1/n↑z$, $h(z) = \sum ↓{n\in B} 1/n↑z$, the product
$g(z)h(z)$ corresponds to the multiset product $AB$.

\ansno 20. Type 3: $(S↓0, \ldotss , S↓r) = (M↓{00}, \ldotss ,
M↓{r0}) = (\{0\}$, $\ldotss$, $\{A\}$, $\{A - 1, A\}$, $\{A - 1, A,
A\}$, $\{A - 1, A - 1, A, A, A\}$, $\ldotss$, $\{A + C - 3, A + C-3,A+C
- 2, A + C - 2, A + C - 2\})$.\xskip Type 5: $(M↓{00}, \ldotss , M↓{r0})
= (\{0\}$, $\ldotss$, $\{A\}$, $\{A - 1, A\}$, $\ldotss$, $\{A + C - 1,
A + C\}$, $\{A + C - 1, A + C - 1, A + C\}$, $\ldotss$, $\{A + C +
D - 1, A + C + D - 1, A + C + D\})$; $(M↓{01}, \ldotss , M↓{r1})
=(\emptyset$, $\ldotss$,$\emptyset$, $\ldotss$, $\emptyset$, $\{A+C-2\}$, $\ldotss$,
$\{A + C + D - 2\})$, $S↓i = M↓{i0} \uplus M↓{i1}$.

\ansno 21. For example, let $u = 2↑{8q+5}$, $x = (2↑{(q+1)u} -
1)/(2↑u - 1) = 2↑{qu} + \cdots + 2↑u + 1$, $y = 2↑{(q+1)u} + 1$.
Then $xy = (2↑{2(q+1)u} - 1)/(2↑u - 1)$. If $n = 2↑{4(q+1)u}
+ xy$, we have $l(n) ≤ 4(q + 1)u + q + 2$ by Theorem F\null, but
$l↑*(n) = 4(q + 1)u + 2q + 2$ by Theorem H.

\ansno 22. Underline everything except the $u - 1$ insertions
used in the calculation of $x$.

\ansno 23. Theorem G (everything underlined).

\ansno 24. Use the numbers $(B↑{a↓i} - 1)/(B - 1)$, $0 ≤
i ≤ r$, underlined when $a↓i$ is underlined; and $c↓kB↑{i-1}(B↑{b↓j}
- 1)/(B - 1)$ for $0 ≤ j < t$, $0 < i ≤ b↓{j+1}-b↓j$, $1≤k ≤ l↑0(B)$, underlined
when $c↓k$ is underlined, where $c↓0$, $c↓1$, $\ldots$ is a minimum
length $l↑0$-chain for $B$. To prove the second inequality,
let $B = 2↑m$ and use (3).\xskip (The second inequality is rarely,
if ever, an improvement on Theorem G.)
%folio 824 galley 6a (C) Addison-Wesley 1978	*
\ansno 25. We may assume that $d↓k = 1$. Use the rule  R $A↓{k-1}
\ldotsm A↓1$, where $A↓j = \null$``XR'' if ${d↓j = 1}$, $A↓j = \null$``R''
otherwise, and where ``R'' means take the square root, ``X''
means multiply by $x$. For example, if $y = (.1101101)↓2$, the
rule is R R XR XR R XR XR.\xskip (There exist binary square-root extraction
algorithms suitable for computer hardware, requiring an execution
time comparable to that of division; computers with such hardware
could therefore calculate more general fractional powers using the
technique in this exercise.)

\ansno 26. If we know the pair $(F↓k, F↓{k-1})$, then $(F↓{k+1},
F↓k) = (F↓k + F↓{k-1}, F↓k)$ and $(F↓{2k}, F↓{2k-1}) = (F↑{2}↓{k}
+ 2F↓kF↓{k-1}, F↑{2}↓{k} + F↑{2}↓{k-1})$; so a binary method
can be used to calculate $(F↓n, F↓{n-1})$, using $O(\log n)$
arithmetic operations. Perhaps better is to use the pair of
values $(F↓k, L↓k)$, where $L↓k = F↓{k-1} + F↓{k+1}$ (cf.\ Section
4.5.4); then $(F↓{k+1}, L↓{k+1}) = \biglp {1\over 2}(F↓k + L↓k),
{1\over 2}(5F↓k + L↓k)\bigrp$, $(F↓{2k}, L↓{2k}) = \biglp F↓kL↓k, L↑{2}↓{k}
- 2(-1)↑k\bigrp $.

For the general linear recurrence $x↓n = a↓1x↓{n-1}
+ \cdots + a↓dx↓{n-d}$, we can compute $x↓n$ in $O(d↑3\log
n)$ arithmetic operations by computing the $n$th power of an
appropriate $d\times d$ matrix.\xskip $\biglp$This observation is due to J. C.
P. Miller and D. J. S. Brown, {\sl Comp.\ J.} {\bf 9} (1966),
188--190.$\bigrp$
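The pair recurrences translate directly into a left-to-right binary routine; a sketch (ours):

```python
# Sketch of the pair method: scan the bits of n from the left, doubling
# with (F_{2k}, F_{2k-1}) = (F_k^2 + 2 F_k F_{k-1}, F_k^2 + F_{k-1}^2)
# and stepping with (F_{k+1}, F_k) = (F_k + F_{k-1}, F_k).
def fib_pair(n):
    a, b = 0, 1                                   # (F_0, F_{-1})
    for bit in bin(n)[2:]:
        a, b = a * a + 2 * a * b, a * a + b * b   # k -> 2k
        if bit == '1':
            a, b = a + b, a                       # k -> k + 1
    return a, b                                   # (F_n, F_{n-1})
```

Each bit of n costs O(1) arithmetic operations, giving the O(log n) bound of the answer.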

\ansno 27. First form the $2↑m - m - 1$ products $x↑{e↓1}↓{1}
\ldotsm x↑{e↓m}↓{m}$, where $0 ≤ e↓j ≤ 1$ and $e↓1 + \cdots +
e↓m ≥ 2$. Then if $n↓j = (d↓{jλ} \ldotsm d↓{j1}d↓{j0})↓2$,
the sequence begins with $x↑{d↓{1λ}}↓{1} \ldotsm x↑{d↓{mλ}}↓{m}$
and then we square and multiply by $x↑{d↓{1i}}↓{1} \ldotsm x↑{d↓{mi}}↓{m}$,
for $i = λ - 1$, $\ldotss$, 1, 0.\xskip$\biglp$Straus [{\sl AMM \bf 71}
(1964), 807--808] has shown that $2λ(n)$ may be replaced by
$(1 + ε)λ(n)$ for any $ε > 0$, by generalizing this binary method
to $2↑k$-ary as in Theorem D\null. At least $l(n↓1 + \cdots + n↓m)$
multiplications are obviously required. See N. Pippenger, {\sl Proc.\
IEEE Symp.\ Foundations of Comp.\ Sci.\ \bf17} (1976), 258--263, for extensive
generalizations.$\bigrp$
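A sketch (ours) of the first paragraph's method in its plain binary form, working modulo a prime p merely to keep the numbers small: precompute the product of every subset of the x_j, then square once per bit position and multiply by the table entry selected by the current bits of the exponents.

```python
# Sketch: simultaneous exponentiation x_1^n_1 * ... * x_m^n_m (mod p).
# table[mask] holds the product of x_j over the bits set in mask.
def multi_exp(xs, ns, p):
    m = len(xs)
    table = [1] * (1 << m)
    for mask in range(1, 1 << m):
        low = mask & -mask                       # lowest set bit
        table[mask] = table[mask ^ low] * xs[low.bit_length() - 1] % p
    result = 1
    for i in range(max(ns).bit_length() - 1, -1, -1):
        result = result * result % p             # one squaring per bit
        mask = sum(((n >> i) & 1) << j for j, n in enumerate(ns))
        result = result * table[mask] % p        # one table factor per bit
    return result
```

This uses about lg(max n_j) squarings plus one multiplication per bit position, rather than a separate binary method for each factor.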

\def\\{\mathbin{\char'562}}
\ansno 28. (a)\9 $x \\ y = x ∨ y ∨ (x + y)$, where
``$∨$'' is logical ``or'', cf.\ exercise 4.6.2--26; clearly $\nu
(x \\ y) ≤ \nu (x ∨ y) + \nu (x ∧ y) = \nu (x) + \nu (y)$.\xskip (b)
Note first that $A↓{i-1}/2↑{d↓{i-1}}\subset
A↓i/2↑{d↓i}$ for $1 ≤ i ≤ r$. Secondly, note that $d↓j = d↓{i-1}$ in
a nondoubling; for otherwise $a↓{i-1} ≥ 2a↓j ≥ a↓j + a↓k = a↓i$.
Hence $A↓j \subset A↓{i-1}$ and $A↓k \subset A↓{i-1}/2↑{d↓j-d↓k}
$.\xskip (c) An easy induction on $i$, except that
close steps need closer attention. Let us say that $m$ has property
$P(α)$ if the 1's in its binary representation all appear in
consecutive blocks of $≥α$ in a row. If $m$ and $m↑\prime$ have
$P(α)$, so does $m \\ m↑\prime $; if $m$ has $P(α)$ then $\rho
(m)$ has $P(α + \delta )$. Hence $B↓i$ has $P(1 + \delta c↓i)$.
Finally if $m$ has $P(α)$ then $\nu \biglp \rho (m)\bigrp ≤ (α +
\delta )\nu (m)/α$; for $\nu (m) = \nu ↓1 + \cdots + \nu ↓q$,
where each block size $\nu ↓j$ is $≥α$, hence $\nu \biglp \rho
(m)\bigrp ≤ (\nu ↓1 + \delta ) + \cdots + (\nu ↓q + \delta) ≤ (1
+ \delta /α)\nu ↓1 + \cdots + (1 + \delta /α)\nu ↓q$.\xskip (d) Let
$f = b↓r + c↓r$ be the number of nondoublings and $s$ the number
of small steps. If $f ≥ 3.271 \lg \nu (n)$ we have $s ≥ \lg
\nu (n)$ as desired, by (16). Otherwise we have $a↓i ≤ (1 +
2↑{-\delta})↑{b↓i}2↑{c↓i+d↓i}$ for $0 ≤ i ≤
r$, hence $n ≤ \biglp (1 + 2↑{-\delta})/2\bigrp ↑{b↓r}2↑r$,
and $r ≥ \lg n + b↓r - b↓r \lg(1 + 2↑{-\delta}) ≥ \lg n + \lg
\nu (n) - \lg(1 + \delta c↓r) - b↓r \lg(1 + 2↑{-\delta})$. Let
$\delta = \lceil \lg(f + 1)\rceil $; then $\ln(1 + 2↑{-\delta})
≤ \ln\biglp 1 + 1/(f + 1)\bigrp ≤ 1/(f + 1) ≤ \delta /(1 +
\delta f)$, and it follows that $\lg(1 + \delta x) + (f - x)\lg(1
+ 2↑{-\delta}) ≤ \lg(1 + \delta f)$ for $0 ≤ x ≤ f$. Hence finally
$l(n) ≥ \lg n + \lg \nu(n) - \lg\biglp 1 + (3.271\lg \nu
(n)) \lceil \lg(1 + 3.271 \lg \nu (n))\rceil\bigrp$.\xskip
[{\sl Theoretical Comp.\ Sci.\ \bf 1} (1975),
1--12.]

\ansno 29. In the paper just cited, Sch\"onhage
refined the method of exercise 28 to prove that $l(n)
≥ \lg n + \lg \nu (n) - 2.13$ for all $n$. Can the remaining
gap be closed?

\ansno 30. $n = 31$ is the smallest example;
$l(31) = 7$, but 1, 2, 4, 8, 16, 32, 31 is an addition-subtraction
chain of length 6.\xskip
$\biglp$After proving Theorem E\null, Erd\H os stated that the
same result holds
also for addition-subtraction chains. Sch\"onhage has
extended the lower bound of exercise 28 to addition-subtraction
chains, with $\nu (n)$ replaced by $\=\nu (n) =\null$minimum
number of nonzero digits to represent $n = (n↓q \ldotsm n↓0)↓2$
where each $n↓j$ is $-1$, 0, or $+1$. This quantity $\=\nu(n)$ is the number
of 1's, in the ordinary binary representation of $n$, that
are immediately preceded by 0 or by the string $00(10)↑k1$ for
some $k ≥ 0$.$\bigrp$

\ansno 32. First compute $2↑i$ for $1≤i≤λ(n↓m)$, then compute each $n=n↓j$
by the following variant of the $2↑k$-ary method: For all odd $i<2↑k$, compute
$f↓i=\sum\leftset 2↑{kt+e}\relv d↓t=2↑ei\rightset$ where $n=(\ldotsm d↓1d↓0)↓{2↑k}$,
in at most $\lfloor{1\over k}\lg n\rfloor$ steps; then compute $n=\sum if↓i$ in
at most $\sum l(i)+2↑{k-1}$ further steps. The number of steps per $n↓j$ is
$≤\lfloor{1\over k}\lg n\rfloor+O(k2↑k)$, and this is $λ(n)/λλ(n)+O\biglp
λ(n)λλλ(n)/λλ(n)↑2\bigrp$ when $k=\lfloor\lg\lg n-3\lg\lg\lg n\rfloor$.

$\biglp$A generalization of Theorem E gives the corresponding lower bound.\xskip
Reference: {\sl SIAM J. Computing \bf5} (1976), 100--103. See also the Pippenger
paper cited in connection with exercise 27.$\bigrp$

\ansno 33. The following construction due to D. J. Newman provides the best
upper bound currently known: Let $k=p↓1\ldotsm p↓r$ be the product of the first
$r$ primes. Compute $k$ and all quadratic residues mod $k$ by the method of
exercise 32, in $O(k\log k/2↑r)$ steps (because there are approximately $k/2↑r$
quadratic residues). Also compute all multiples of $k$ that are $≤m↑2$, in about
$m↑2/k$ further steps. Now $m$ additions suffice to compute $1↑2$, $2↑2$,
$\ldotss$, $m↑2$. We have 
$k=\exp\biglp p↓r+O\biglp p↓r/(\log p↓r)↑{1000}\bigrp\bigrp=
\exp\biglp r\ln r+r\ln\ln r-r+r\ln\ln r/\!\ln r-2r/\!\ln r+O\biglp r(\ln\ln r/
\!\ln r)↑2\bigrp\bigrp$; so by choosing 
$$\textstyle r=\hbox{\:u\char'142}\biglp1+(1+
{1\over2}\ln2)/\!\lg\lg m\bigrp\ln m/\!\ln\ln m\hbox{\:u\char'143}$$ it follows
that $l(1↑2,\ldotss,m↑2)=m+O\biglp m\cdot\exp(-{1\over2}\ln2\ln m/\!\ln\ln m)
\bigrp$.

On the other hand, D. Dobkin and R. Lipton have shown that, for any $ε>0$,
$l(1↑2,\ldotss,m↑2)>m+m↑{2/3-ε}$ when $m$ is sufficiently large [{\sl Conf.\
on Theoretical Comp.\ Sci.}, Univ.\ Waterloo (1977), 146--148].

\ansno 35. See {\sl Discrete Math.\ \bf23} (1978), 115--119.
%folio 826 galley 6b (C) Addison-Wesley 1978	*
\ansbegin{4.6.4}

\ansno 1. Set $y ← x↑2$, then compute
$\biglp (\ldotsm(u↓{2n+1}y + u↓{2n-1})y + \cdotss)y+u↓1\bigrp x$.
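In code (a sketch, with the odd-index coefficients given as a list [u_1, u_3, ..., u_{2n+1}]):

```python
# Sketch: evaluate u_1 x + u_3 x^3 + ... + u_{2n+1} x^{2n+1}
# by Horner's rule in y = x^2, followed by one multiplication by x.
def eval_odd(odd_coeffs, x):
    y = x * x
    v = 0
    for u in reversed(odd_coeffs):
        v = v * y + u
    return v * x
```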

\ansno 2. Replacing $x$ in (2) by the polynomial $x + x↓0$ leads
to the following procedure:

\yyskip\hangindent 40pt\noindent\hbox to 40pt{\hfill\bf G1. }
Do step G2 for $k = n$, $n - 1$, $\ldotss$,
0 (in this order), and stop.

\yskip\hangindent 40pt\noindent\hbox to 40pt{\hfill\bf G2. }
Set $v↓k ← u↓k$, and then set
$v↓j ← v↓j + x↓0v↓{j+1}$ for $j = k$, $k + 1$, $\ldotss$, $n - 1$.\xskip
(When $k = n$, this step simply sets $v↓n ← u↓n$.)\quad\blackslug

\yyskip\noindent The computations turn out to be
identical to those in H1 and H2, but performed in a different order.
$\biglp$This
application was, in fact, Newton's original motivation for using (2).$\bigrp$
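Steps G1 and G2 can be transcribed as follows (a sketch; the coefficients of u are given in ascending powers, and the result lists the coefficients of $u(x + x_0)$):

```python
# Sketch of steps G1-G2: repeated Horner-style updates shift the
# polynomial u(x) to u(x + x0).
def taylor_shift(u, x0):
    n = len(u) - 1
    v = [0] * (n + 1)
    for k in range(n, -1, -1):        # G1: k = n, n-1, ..., 0
        v[k] = u[k]                   # G2: start with u_k ...
        for j in range(k, n):
            v[j] += x0 * v[j + 1]     # ... then fold in x0
    return v
```

For instance, shifting 1 + 2x + 3x^2 by x0 = 2 gives 17 + 14x + 3x^2.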

\ansno 3. The coefficient of $x↑k$ is a polynomial in $y$ that
may be evaluated by Horner's rule: $\biglp \ldotsm(u↓{n,0}x
+ (u↓{n-1,1}y + u↓{n-1,0}))x + \cdotss\bigrp x + \biglp (\ldotsm
(u↓{0,n}y + u↓{0,n-1})y + \cdotss)y + u↓{0,0}\bigrp $.\xskip $\biglp$For
a ``homogeneous'' polynomial, such as $u↓nx↑n + u↓{n-1}x↑{n-1}y
+ \cdots + u↓1xy↑{n-1} + u↓0y↑n$, another scheme is more efficient:
first divide $x$ by $y$, evaluate a polynomial in $x/y$, then
multiply by $y↑n$.$\bigrp$

\ansno 4. Rule (2) involves $4n$ or $3n$ real multiplications
and $4n$ or $7n$ real additions; (3) is worse: it takes $4n
+ 2$ or $4n + 1$ mults, $4n + 2$ or $4n + 5$ adds.

\ansno 5. One multiplication to compute $x↑2$; $\lfloor
n/2\rfloor$ multiplications and $\lfloor n/2\rfloor$ additions
to evaluate the first line; $\lceil n/2\rceil$ multiplications
and $\lceil n/2\rceil - 1$ additions to evaluate the second
line; and one addition to add the two lines together. Total:
$n + 1$ multiplications and $n$ additions.
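One plausible reading of the two ``lines'' (this sketch is ours): evaluate the even-index and odd-index coefficients as polynomials in y = x^2, then combine with one final addition. Counting by hand as in the answer, this comes to n + 1 multiplications and n additions.

```python
# Sketch: split u(x) into even and odd parts in y = x^2;
# u is the coefficient list [u_0, ..., u_n] in ascending powers.
def eval_split(u, x):
    y = x * x
    even = 0
    for c in reversed(u[0::2]):       # even part, a polynomial in y
        even = even * y + c
    odd = 0
    for c in reversed(u[1::2]):       # odd part over x, also in y
        odd = odd * y + c
    return even + odd * x
```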

\ansno 6. \hbox to 21pt{\hfill\bf J1. }Compute and store the values $x↑{2}↓{0}$,
$\ldotss$, $x↑{\lceil n/2\rceil}↓{0}$.

\yskip\noindent\hbox to 40pt{\hfill\bf J2. }Set $v↓j ← u↓jx↑{j-\lfloor n/2\rfloor
}↓{0}$ for $0 ≤ j ≤ n$.

\yskip\noindent\hbox to 40pt{\hfill\bf J3. }For $k = 0$, 1, $\ldotss$, $n - 1$,
set $v↓j ← v↓j + v↓{j+1}$ for $j = n - 1$, $\ldotss$, $k + 1$, $k$.

\yskip\noindent\hbox to 40pt{\hfill\bf J4. }Set $v↓j ← v↓jx↑{\lfloor n/2\rfloor
-j}↓{0}$ for $0 ≤ j ≤ n$.\quad\blackslug

\yyskip\noindent There are $(n + n↑2)/2$ additions, $n + \lceil
n/2\rceil - 1$ multiplications, $n$ divisions. Another multiplication
and division can be saved by treating $v↓n$ and $v↓0$ as special
cases.\xskip {\sl Reference:  SIGACT News \bf 7}, 3 (Summer 1975),
32--34.

\ansno 7. Let $x↓j = x↓0 + jh$, and consider (42), (44). Set
$y↓j ← u(x↓j)$ for $0 ≤ j ≤ n$. For $k = 1$, 2, $\ldotss$, $n$ (in
this order), set $y↓j ← y↓j - y↓{j-1}$ for $j = n$, $n - 1$, $\ldotss
$, $k$ (in this order). Now $β↓j = y↓j$ for all $j$.
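In code (a sketch): the subtractions must run with j decreasing, so that each one uses the value left by the previous pass; the resulting coefficients are the forward differences of u at x_0, and they let the table be extended by additions alone.

```python
# Sketch: turn tabulated values u(x_0), u(x_0 + h), ..., u(x_0 + nh)
# into difference coefficients beta_j, then tabulate further values
# of the degree-n polynomial using only additions.
def difference_coeffs(values):
    y = list(values)
    n = len(y) - 1
    for k in range(1, n + 1):
        for j in range(n, k - 1, -1):   # j = n, n-1, ..., k
            y[j] -= y[j - 1]
    return y

def tabulate(beta, count):
    d = list(beta)
    out = [d[0]]
    for _ in range(count):
        for j in range(len(d) - 1):     # roll the difference table forward
            d[j] += d[j + 1]
        out.append(d[0])
    return out
```

For u(x) = x^2 with h = 1, the three values 0, 1, 4 yield beta = (0, 1, 2), and pure additions then reproduce 9, 16, 25, and so on.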

\ansno 8. See (43).

\ansno 9. [{\sl Combinatorial Mathematics} (Buffalo: Math.\ Assoc.\ of
America, 1963), 26--28.]\xskip This formula can be regarded as
an application of the principle of inclusion and exclusion (Section
1.3.3), since the sum of the terms for $n - ε↓1 - \cdots - ε↓n
= k$ is the sum of all $x↓{1j↓1}x↓{2j↓2}\ldotsm
x↓{nj↓n}$ for which $k$ values of the $j↓i$ do not appear.
A direct proof can be given by observing that the coefficient
of $x↓{1j↓1}\ldotsm x↓{nj↓n}$ is
$$\sum (-1)↑{n-ε↓1-\cdots-ε↓n\,}ε↓{j↓1}\ldotsm ε↓{j↓n};$$
if the $j$'s are distinct, this equals unity; but
if $j↓1$, $\ldotss$, $j↓n$ are all $≠k$ for some $k$, then it is zero, since the terms
for $ε↓k = 0$ cancel the terms for $ε↓k = 1$.

To evaluate the sum efficiently, we can start
with $ε↓1 = 1$, $ε↓2 = \cdots = ε↓n = 0$, and we can then proceed
through all combinations of the $ε$'s in such a way that only
one $ε$ changes from one term to the next.\xskip(See ``Gray code''
in Chapter 7.)\xskip The work to compute the first term is $n - 1$
multiplications; the subsequent $2↑n - 2$ terms each involve
$n$ additions, then $n - 1$ multiplications, then one more addition.
Total: $(2↑n - 1)(n - 1)$ multiplications, and $(2↑n - 2)(n
+ 1)$ additions. Only $n + 1$ temp storage locations are needed,
one for the main partial sum and one for each factor of the
current product.
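The formula itself is easy to transcribe (a sketch without the Gray-code refinement, so each term recomputes its row sums from scratch):

```python
# Sketch of the inclusion-exclusion formula: per(A) is the sum over all
# 0/1 vectors (eps_1, ..., eps_n), except the all-zero one, of
# (-1)^(n - eps_1 - ... - eps_n) times the product of restricted row sums.
from itertools import product as cartesian

def permanent(a):
    n = len(a)
    total = 0
    for eps in cartesian((0, 1), repeat=n):
        if not any(eps):
            continue                       # the all-zero term vanishes
        prod = 1
        for row in a:
            prod *= sum(e * x for e, x in zip(eps, row))
        total += (-1) ** (n - sum(eps)) * prod
    return total
```

With the Gray-code traversal described above, each successive term differs in a single epsilon, so the row sums can be updated instead of recomputed.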
%folio 828 galley 7 (C) Addison-Wesley 1978	*
\ansno 10. $\sum ↓{1≤k<n} (k + 1){n\choose k+1} = n(2↑{n-1}
- 1)$ multiplications and $\sum ↓{1≤k<n} k{n\choose k+1} =
{n2↑{n-1} - 2↑n} + 1$ additions. This is approximately half as
many arithmetic operations as the method of exercise 9, although
it requires a more complicated program to control the sequence.
Approximately ${n\choose \lceil n/2\rceil } + {n\choose \lceil
n/2\rceil -1}$ temporary storage locations must be used, and
this grows exponentially large (on the order of $2↑n/\sqrt{n}\,$).

The method in this exercise is equivalent to the
unusual matrix factorization of the permanent function given
by Jurkat and Ryser in {\sl J. Algebra \bf 5} (1967), 342--357.
It may also be regarded as an application of (39) and (40), in
an appropriate sense.

\ansno 12. $\biglp$Here is a brief summary of progress on this famous research
problem: J. Hopcroft and L. R. Kerr proved, among other
things, that 7 multiplications are necessary in $2 \times 2$ matrix
multiplication [{\sl SIAM J. Appl.\ Math.\ \bf 20} (1971), 30--36].
R. L. Probert showed that all 7-multiplication schemes, in
which each multiplication takes a linear combination of elements
from one matrix and multiplies by a linear combination of elements
from the other, must have at least 15 additions [{\sl SIAM J.
Computing \bf5} (1976), 187--203]. For $n=3$, the best method known is due to
J. D. Laderman [{\sl Bull.\ Amer.\ Math.\ Soc.\ \bf 82}
(1976), 126--128], who showed that 23 noncommutative multiplications
suffice. His construction has been generalized by Ondrej S\'ykora, who exhibited a
method requiring $n↑3-(n-1)↑2$ noncommutative multiplications and
$n↑3-n↑2+11(n-1)↑2$ additions, a result that also reduces to (36) when $n=2$
[{\sl Lecture Notes in Comp.\ Sci.\ \bf53} (1977), 504--512].
The best lower bound known to hold for all $n$ is the fact that
$2n↑2-1$ nonscalar multiplications are necessary [Jean-Paul Lafon and S. Winograd,
{\sl Theoretical Comp.\ Sci.}, to appear]. The best upper bounds known for large
$n$ come from Pan's constructions, cf.\ exercise 59.$\bigrp$
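For concreteness, the classical seven-multiplication scheme for 2 x 2 matrices (Strassen's identities, the kind of scheme at issue in these results) can be sketched as follows; it uses 7 multiplications and 18 additions or subtractions.

```python
# Sketch: Strassen's seven products m1..m7 and the four recombinations.
def strassen_2x2(A, B):
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]
```

None of the seven products uses commutativity of the entries, which is what lets the scheme be applied recursively to block matrices.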

\ansno 13. By summing geometric series, we find that $F(t↓1, \ldotss , t↓n)$ equals
$$\textstyle\sum ↓{0≤s↓1<m↓1,\ldots,0≤s↓n<m↓n}
\exp\biglp -2πi(s↓1t↓1/m↓1 + \cdots + s↓nt↓n/m↓n)\bigrp f(s↓1, \ldotss
, s↓n)/m↓1 \ldotsm m↓n.$$
The inverse transform times $m↓1 \ldotsm m↓n$ can be found by
doing a regular transform and interchanging $t↓j$ with $m↓j
- t↓j$ when $t↓j ≠ 0$, cf.\ exercise 4.3.3--15.

$\biglp$If we regard $F(t↓1, \ldotss , t↓n)$
as the coefficient of $x↑{t↓1}↓{1} \ldotsm x↑{t↓n}↓{n}$ in a
multivariate polynomial, the finite Fourier transform amounts
to evaluation of this polynomial at roots of unity, and the
inverse transform amounts to finding the interpolating polynomial.$\bigrp$
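The index-reversal trick for the inverse transform is easy to check numerically. Here is a minimal Python sketch (one dimension, $m=6$; the function names are mine, and `transform` denotes the unnormalized sum $\sum↓t\omega↑{st}F(t)$):

```python
import cmath

def transform(F):
    # "Regular" (unnormalized) transform: f(s) = sum_t w^(s t) F(t),
    # where w = exp(2 pi i / m).
    m = len(F)
    return [sum(cmath.exp(2j * cmath.pi * s * t / m) * F[t]
                for t in range(m)) for s in range(m)]

def inverse_times_m(f):
    # The inverse transform times m: do a regular transform, then
    # interchange index t with m - t whenever t != 0.
    m = len(f)
    g = transform(f)
    return [g[(m - t) % m] for t in range(m)]

F = [complex(3 * t - 4, t * t) for t in range(6)]   # arbitrary data
G = inverse_times_m(transform(F))                   # should be 6 * F
```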

\ansno 14. Let $m↓1=\cdots=m↓n=2$, $F(t↓1, t↓2, \ldotss , t↓n) = F(2↑{n-1}t↓n
+ \cdots + 2t↓2 + t↓1)$, and $f(s↓1, s↓2, \ldotss , s↓n) = f(2↑{n-1}s↓1
+ 2↑{n-2}s↓2 + \cdots + s↓n)$; note the reversed treatment between
$t$'s and $s$'s. Also let $g↓k(s↓k, \ldotss , s↓n, t↓k)$ be $\omega$
raised to the $2↑{k-1}t↓k(s↓n + 2s↓{n-1} + \cdots + 2↑{n-k}s↓k)$
power.

At each iteration we essentially take $2↑{n-1}$ pairs of complex numbers $(α,β)$ and
replace them by $(α+\zeta β,α-\zeta β)$, where $\zeta$ is a suitable power of
$\omega$, hence $\zeta=\cos\theta+i\sin\theta$ for some $\theta$.
If we take advantage of simplifications when $\zeta=\pm1$ or $\pm i$, the total
work comes to $\biglp(n-3)\cdot 2↑{n-1}+2\bigrp$ complex multiplications and $n\cdot
2↑n$ complex additions; the techniques of exercise 41 can be used to reduce the
real multiplications and additions used to implement these complex operations.

The number of complex multiplications can be reduced about 25 per cent without
changing the number of additions by combining passes $k$ and $k+1$ for $k=1$, 3,
$\ldotss$; this means that $2↑{n-2}$ quadruples $(α,β,\gamma,\delta)$ are being
replaced by $(α+\zetaβ+\zeta↑2\gamma+\zeta↑3\delta$, $α+i\zetaβ-\zeta↑2\gamma-
i\zeta↑3\delta$, $α-\zetaβ+\zeta↑2\gamma-\zeta↑3\delta$, $α-i\zetaβ-\zeta↑2\gamma
+i\zeta↑3\delta)$. The number of complex multiplications when $n$ is even is
thereby reduced to $(3n-2)2↑{n-3}-5\lfloor2↑{n-1}/3\rfloor$.

These calculations assume that the given numbers $F(t)$ are complex. If the $F(t)$
are real, then $f(s)$ is the complex conjugate of $f(2↑n-s)$, so it is desirable to
avoid the redundancy by computing only the $2↑n$ independent real numbers $f(0)$,
$\real f(1)$, $\ldotss$, $\real f(2↑{n-1}-1)$, $f(2↑{n-1})$, $\imag 
f(1)$, $\ldotss$, $\imag f(2↑{n-1}-1)$. The entire calculation in this case can
be done by working with $2↑n$ real values, using the fact that $f↑{[k]}(s↓{n-k+1},
\ldotss,s↓n,t↓1,\ldotss,t↓{n-k})$ will be the complex conjugate of $f↑{[k]}(
s↓{n-k+1}↑\prime,\ldotss,s↓n↑\prime,t↓1,\ldotss,t↓{n-k})$ when $(s↓1\ldotsm s↓n)↓2
+(s↓1↑\prime\ldotsm s↓n↑\prime)↓2≡0\modulo{2↑n}$. About half as many multiplications
and additions are needed as in the complex case.
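The operation count $\biglp(n-3)\cdot2↑{n-1}+2\bigrp$ can be verified by instrumenting a textbook radix-2 transform. A Python sketch (this is the ordinary complex-data algorithm, not the real-data variant just described; multiplications by $\pm1$ and $\pm i$ are treated as free):

```python
import cmath

def fft_count(a):
    # Radix-2 decimation-in-time FFT of a list of length m = 2^n,
    # counting only the nontrivial complex multiplications, i.e.,
    # treating twiddle factors 1, -1, i, -i as free.
    m = len(a)
    n = m.bit_length() - 1
    rev = [0] * m
    for i in range(1, m):                     # bit-reversal permutation
        rev[i] = rev[i >> 1] >> 1 | (i & 1) << (n - 1)
    a = [a[rev[i]] for i in range(m)]
    mults = 0
    size = 2
    while size <= m:
        for start in range(0, m, size):
            for j in range(size // 2):
                e = j * (m // size)           # zeta = omega^e
                zeta = cmath.exp(-2j * cmath.pi * e / m)
                if m >= 4 and e % (m // 4) != 0:
                    mults += 1                # zeta not in {1, -1, i, -i}
                b = zeta * a[start + j + size // 2]
                a[start + j + size // 2] = a[start + j] - b
                a[start + j] = a[start + j] + b
        size *= 2
    return a, mults

def dft(a):                                   # direct transform, for checking
    m = len(a)
    return [sum(a[t] * cmath.exp(-2j * cmath.pi * s * t / m)
                for t in range(m)) for s in range(m)]
```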

[The fast Fourier
transform algorithm is essentially due to C. Runge and H. K\"onig
in 1924, and it was generalized by J. W. Cooley and J.
W. Tukey, {\sl Math.\ Comp.\ \bf 19} (1965), 297--301. Its interesting
history has been traced by J. W. Cooley, P. A. W. Lewis, and P.
D. Welch, {\sl Proc.\ IEEE} {\bf 55} (1967), 1675--1677. Details
concerning its use have been discussed by R. C. Singleton, {\sl CACM
\bf 10} (1967), 647--654; M. C. Pease, {\sl JACM \bf 15} (1968),
252--264; G. D. Bergland, {\sl Math.\ Comp.\ \bf 22} (1968), 275--279,
{\sl CACM \bf 11} (1968), 703--710; A. M. Macnaghten and C. A. R. Hoare, {\sl Comp.\
J. \bf20} (1977), 78--83. See also exercises 53 and 57.]

\ansno 15. (a)\9 The hint follows by integration and induction.
Let $f↑{(n)}(\theta )$ take on all values between $A$ and $B$
inclusive, as $\theta$ varies from $\min(x↓0, \ldotss , x↓n)$
to $\max(x↓0, \ldotss , x↓n)$. Replacing $f↑{(n)}$ by these bounds,
in the stated integral, yields $A/n! ≤ f(x↓0, \ldotss , x↓n)
≤ B/n!$.\xskip (b) It suffices to prove this for $j = n$. Let $f$
be Newton's interpolation polynomial; then $f↑{(n)}$ is the
constant $n! α↓n$.

\ansno 16. Carry out the multiplications and additions of (43)
as operations on polynomials.\xskip (The special case $x↓0 = x↓1 =
\cdots = x↓n$ is considered in exercise 2. We have used this
method in step C8 of Algorithm 4.3.3C.)

\ansno 17. T. M. Vari has shown that $n - 1$ multiplications
are necessary, by proving that $n$ are necessary to compute
$x↑{2}↓{1} + \cdots + x↑{2}↓{n}$ [Cornell Computer Science report
120 (Jan. 1972)].

\ansno 18. $α↓0 = {1\over 2}(u↓3/u↓4 + 1)$, $β = u↓2/u↓4 - α↓0(α↓0
- 1)$, $α↓1 = α↓0β - u↓1/u↓4$, $α↓2 = β - 2α↓1$, $α↓3 = u↓0/u↓4 -
α↓1(α↓1 + α↓2)$, $α↓4 = u↓4$.

\ansno 19. Since $α↓5$ is the leading coefficient, we may assume
without loss of generality that $u(x)$ is monic (i.e., $u↓5
= 1$). Then $α↓0$ is a root of the cubic equation ${40z↑3 - 24u↓4z↑2}
+ (4u↑{2}↓{4} + 2u↓3)z + (u↓2 - u↓3u↓4) = 0$; this equation
always has at least one real root, and it may have three. Once
$α↓0$ is determined, we have $α↓3 = u↓4 - 4α↓0$, $α↓1 = u↓3 -
4α↓0α↓3 - 6α↑{2}↓{0}$, $α↓2 = u↓1 - α↓0(α↓0α↓1 + 4α↑{2}↓{0}α↓3
+ 2α↓1α↓3 + α↑{3}↓{0})$, $α↓4 = u↓0 - α↓3(α↑{4}↓{0} + α↓1α↑{2}↓{0}
+ α↓2)$.

For the given polynomial we are to solve the cubic
equation $40z↑3 - 120z↑2 + 80z = 0$; this leads to three solutions
$(α↓0, α↓1, α↓2, α↓3, α↓4, α↓5) = (0, -10, 13, 5, -5, 1)$, $(1,
-20, 68, 1, 11, 1)$, $(2, -10, 13, -3, 27, 1)$.
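As a consistency check, all three parameter sets must define the same quintic under the scheme implicit in answer 20's program, namely $y=(x+α↓0)↑2$, $u(x)=α↓5\biglp((y+α↓1)y+α↓2)(x+α↓3)+α↓4\bigrp$. A Python sketch; the polynomial $x↑5+5x↑4-10x↑3-50x↑2+13x+60$ is reconstructed here from the cubic $40z↑3-120z↑2+80z=0$ via the relations above (it is not stated explicitly in this answer):

```python
def eval_scheme(alphas, x):
    # The scheme implicit in answer 20's program:
    #   y = (x + a0)^2,  u(x) = a5 * (((y + a1) * y + a2) * (x + a3) + a4).
    a0, a1, a2, a3, a4, a5 = alphas
    y = (x + a0) ** 2
    return a5 * (((y + a1) * y + a2) * (x + a3) + a4)

SOLUTIONS = [(0, -10, 13, 5, -5, 1),
             (1, -20, 68, 1, 11, 1),
             (2, -10, 13, -3, 27, 1)]
```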

\ansno 20. $\vtop{\halign{\:t#\hfill\quad⊗\:t#\hfill\qquad⊗#\hfill\cr
LDA⊗X\cr
FADD⊗=$α↓3$=\cr
STA⊗TEMP1\cr
FADD⊗=$α↓0$-$α↓3$=\cr
STA⊗TEMP2\cr
FMUL⊗TEMP2\cr
\lower6pt\null STA⊗TEMP2\cr}}\hskip50pt
\vtop{\halign{\:t#\hfill\quad⊗\:t#\hfill\qquad⊗#\hfill\cr
FADD⊗=$α↓1$=\cr
FMUL⊗TEMP2\cr
FADD⊗=$α↓2$=\cr
FMUL⊗TEMP1\cr
FADD⊗=$α↓4$=\cr
FMUL⊗=$α↓5$=⊗\quad\blackslug\cr}}$

\ansno 21. $z = (x + 1)x - 2$, $w = (x + 5)z + 9$, $u(x) =
(w + z - 8)w - 8$; or $z = (x + 9)x + 26$, $w = (x - 3)z + 73$,
$u(x) = (w + z - 24)w - 12$.
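Both chains must, of course, evaluate the same polynomial; a quick Python check of the two answers (names are mine):

```python
def chain1(x):
    z = (x + 1) * x - 2
    w = (x + 5) * z + 9
    return (w + z - 8) * w - 8

def chain2(x):
    z = (x + 9) * x + 26
    w = (x - 3) * z + 73
    return (w + z - 24) * w - 12
```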

\ansno 22. $α↓6 = 1$, $α↓0 = -1$, $α↓1 = 1$, $β↓1 = -2$, $β↓2 = -2$,
$β↓4 = 1$, $α↓3 = -4$, $α↓2 = 0$, $α↓4 = 4$, $α↓5 = -2$. We form $z =
(x - 1)x + 1$, $w = z + x$, and $u(x) = \biglp (z - x - 4)w +
4\bigrp z - 2$. This takes three multiplications and seven additions; in this
special case we see that another addition can be saved if we compute
$w=x↑2+1$, $z=w-x$.

\ansno 23. (a)\9 We may use induction on $n$; the result is trivial
if $n < 2$. If $f(0) = 0$, then the result is true for the polynomial
$f(z)/z$, so it holds for $f(z)$. If $f(iy) = 0$ for some real
$y ≠ 0$, then $g(\pm iy) = h(\pm iy) = 0$; since the result
is true for $f(z)/(z↑2 + y↑2)$, it holds also for $f(z)$. Therefore
we may assume that $f(z)$ has no roots whose real part is zero.
Now the net number of times the given path circles the origin
is the number of roots of $f(z)$ inside the region, which is
at most 1. When $R$ is large, the path $f(Re↑{it})$ for $π/2
≤ t ≤ 3π/2$ will circle the origin clockwise approximately $n/2$
times; so the path $f(it)$ for $-R ≤ t ≤ R$ must go counterclockwise
around the origin at least $n/2 - 1$ times. For $n$ even, this
implies that $f(it)$ crosses the imaginary axis at least $n
- 2$ times, and the real axis at least $n - 3$ times; for $n$
odd, $f(it)$ crosses the real axis at least $n - 2$ times and
the imaginary axis at least $n - 3$ times. These are roots respectively
of $g(it) = 0$, $h(it) = 0$.

(b)\9 If not, $g$ or $h$ would have a root of the
form $a + bi$ with $a ≠ 0$ and $b ≠ 0$. But this would imply
the existence of at least three other such roots, namely $a
- bi$ and $-a \pm bi$, while $g(z)$ and $h(z)$ have at most $n$ roots.
%folio 832 galley 8 (C) Addison-Wesley 1978	*
\ansno 24. The roots of $u$ are $-7$, $-3 \pm i$, $-2 \pm i$, and $-1$;
permissible values of $c$ are 2 and 4 (but {\sl not\/} 3, since
$c = 3$ makes the sum of the roots equal to zero).\xskip Case 1, $c
= 2$: $p(x) = (x + 5)(x↑2 + 2x + 2)(x↑2 + 1)(x - 1) = x↑6 + 6x↑5
+ 6x↑4 + 4x↑3 - 5x↑2 - 2x - 10$; $q(x) = 6x↑2 + 4x - 2 = 6(x +
1)(x - {1\over 3})$. Let $α↓2 = -1$, $α↓1 = {1\over 3}$; $p↓1(x)
= x↑4 + 6x↑3 + 5x↑2 - 2x - 10 = (x↑2 + 6x + {16\over 3})(x↑2
- {1\over 3}) - {74\over 9}$; $α↓0 = 6$, $β↓0 = {16\over 3}$, $β↓1
= -{74\over 9}$.\xskip Case 2, $c = 4$: A similar analysis gives $α↓2
= 9$, $α↓1 = -3$, $α↓0 = -6$, $β↓0 = 12$, $β↓1 = -26$.

\ansno 25. $β↓1 = α↓2$, $β↓2 = 2α↓1$, $β↓3 = α↓7$, $β↓4 = α↓6$, $β↓5
= β↓6 = 0$, $β↓7 = α↓1$, $β↓8 = 0$, $β↓9 = 2α↓1 - α↓8$.

\ansno 26. (a)\9 $λ↓1 = α↓1 \times λ↓0$, $λ↓2 = α↓2 + λ↓1$, $λ↓3
= λ↓2 \times λ↓0$, $λ↓4 = α↓3 + λ↓3$, $λ↓5 = λ↓4 \times λ↓0$, $λ↓6
= α↓4 + λ↓5$.\xskip (b) $\kappa ↓1 = 1 + β↓1x$, $\kappa ↓2 = 1 + β↓2\kappa
↓1x$, $\kappa ↓3 = 1 + β↓3\kappa ↓2x$, $u(x) = β↓4\kappa ↓3 = β↓1β↓2β↓3β↓4x↑3
+ β↓2β↓3β↓4x↑2 + β↓3β↓4x + β↓4$.\xskip (c) If any coefficient is zero,
the coefficient of $x↑3$ must also be zero in (b), while (a)
yields an arbitrary polynomial $α↓1x↑3 + α↓2x↑2 + α↓3x + α↓4$
of degree $≤3$.

\ansno 27. Otherwise there would be a nonzero polynomial $f(q↓n,
\ldotss , q↓1, q↓0)$ with integer coeffi\-cients such that $q↓n
\cdot f(q↓n, \ldotss , q↓1, q↓0) = 0$ for all sets $(q↓n, \ldotss
, q↓0)$ of real numbers. This cannot happen, since it is easy
to prove by induction on $n$ that a nonzero polynomial always
takes on some nonzero value.\xskip (Cf.\ exercise 4.6.1--16.
However, this result is false
for {\sl finite} fields in place of the real numbers.)

\ansno 28. The indeterminate quantities $α↓1$, $\ldotss$, $α↓s$
form an algebraic basis for the polynomial domain $Q[α↓1, \ldotss , α↓s]$, where $Q$
is the field of rational numbers. Since $s + 1$ is greater than
the number of elements in a basis, the polynomials $f↓j(α↓1,
\ldotss , α↓s)$ are algebraically dependent; this means
that there is a nonzero polynomial $g$ with rational coefficients
such that $g\biglp f↓0(α↓1, \ldotss , α↓s), \ldotss , f↓s(α↓1,
\ldotss , α↓s)\bigrp$ is identically zero.

\ansno 29. Given $j↓0$, $\ldotss$, $j↓t \in \{0, 1, \ldotss , n\}$,
there are nonzero polynomials with integer coefficients such
that $g↓j(q↓{j↓0}, \ldotss , q↓{j↓t}) = 0$ for all
$(q↓n, \ldotss , q↓0)$ in $R↓j$, $1 ≤ j ≤ m$. The product $g↓1g↓2
\ldotsm g↓m$ is therefore zero for all $(q↓n, \ldotss , q↓0)$
in $R↓1 ∪ \cdots ∪ R↓m$.

\ansno 30. Starting with the construction in Theorem M\null, we will
prove that $m↓p + (1 - \delta ↓{0m↓c})$ of the $β$'s
may effectively be eliminated: If $\mu↓i$ corresponds to
a parameter multiplication, we have $\mu ↓i = β↓{2i-1} \times
(T↓{2i} + β↓{2i})$; add $cβ↓{2i-1}β↓{2i}$ to each $β↓j$ for
which $c\mu ↓i$ occurs in $T↓j$, and replace $β↓{2i}$ by zero.
This removes one parameter for each parameter multiplication. If $\mu ↓i$ is the
first chain multiplication, then $\mu ↓i = (\gamma ↓1x + \theta ↓1 + β↓{2i-1})
\times (\gamma ↓2x + \theta ↓2 + β↓{2i})$, where $\gamma ↓1$,
$\gamma ↓2$, $\theta ↓1$, $\theta ↓2$ are polynomials in $β↓1$, $\ldotss
$, $β↓{2i-2}$ with integer coefficients. Here $\theta ↓1$ and
$\theta ↓2$ can be ``absorbed'' into $β↓{2i-1}$ and $β↓{2i}$,
respectively, so we may assume that $\theta ↓1 = \theta ↓2 =
0$. Now add $cβ↓{2i-1}β↓{2i}$ to each $β↓j$ for which $c\mu
↓i$ occurs in $T↓j$; add $β↓{2i-1}\gamma ↓2/\gamma ↓1$ to $β↓{2i}$;
and set $β↓{2i-1}$ to zero. The result set is unchanged by this
elimination of $β↓{2i-1}$, except for the values of $α↓1$, $\ldotss
$, $α↓s$ such that $\gamma ↓1$ is zero.\xskip $\biglp$This proof is essentially
due to V. J. Pan, {\sl Russian Mathematical Surveys} {\bf 21}
(1966), 105--136.$\bigrp$\xskip The latter case can be handled as in
the proof of Theorem A\null, since the polynomials with $\gamma ↓1
= 0$ can be evaluated by eliminating $β↓{2i}$ (as in the first
construction, where $\mu ↓i$ corresponds to a parameter multiplication).

\ansno 31. Otherwise we could add one parameter multiplication
as a final step, and violate Theorem C\null.\xskip (The exercise is an
improvement over Theorem A\null, in this special case, since there
are only $n$ degrees of freedom in the coefficients of a monic
polynomial of degree $n$.)

\ansno 32. $λ↓1 = λ↓0 \times λ↓0$, $λ↓2 = α↓1 \times λ↓1$, $λ↓3
= α↓2 + λ↓2$, $λ↓4 = λ↓3 \times λ↓1$, $λ↓5 = α↓3 + λ↓4$. We need
at least three multiplications to compute $u↓4x↑4$ (see Section
4.6.3), and at least two additions by Theorem A.

\ansno 33. We must have $n + 1 ≤ 2m↓c + m↓p + \delta ↓{0m↓c}
$, and $m↓c + m↓p = (n + 1)/2$; so there are no parameter multiplications.
Now the first $λ↓i$ whose leading coefficient (as a polynomial
in $x$) is not an integer must be obtained by a chain addition;
and there must be at least $n + 1$ parameters, so there are
at least $n + 1$ parameter additions.

\ansno 34. Transform the given chain step by step, and also
define the ``content'' $c↓i$ of $λ↓i$, as follows:\xskip (Intuitively,
$c↓i$ is the leading coefficient of $λ↓i$.)\xskip Define $c↓0 = 1$.
\xskip (a) If the step has the form $λ↓i = α↓j + λ↓k$, replace it by
$λ↓i=β↓j+λ↓k$, where $β↓j=α↓j/c↓k$; and define $c↓i = c↓k$.\xskip (b) If the step
has the form $λ↓i = α↓j - λ↓k$, replace it by $λ↓i = β↓j + λ↓k$,
where $β↓j = -α↓j/c↓k$; and define $c↓i = -c↓k$.\xskip (c) If the
step has the form $λ↓i = α↓j \times λ↓k$, replace it by $λ↓i
= λ↓k$ (the step will be deleted later); and define $c↓i = α↓jc↓k$.\xskip
(d) If the step has the form $λ↓i = λ↓j \times λ↓k$, leave it
unchanged; and define $c↓i = c↓jc↓k$.

After this process is finished, delete all steps
of the form $λ↓i = λ↓k$, replacing $λ↓i$ by $λ↓k$ in each future
step that uses $λ↓i$. Then add a final step $λ↓{r+1} = β \times
λ↓r$, where $β = c↓r$. This is the desired scheme, since it
is easy to verify that the new $λ↓i$ are just the old ones divided
by the factor $c↓i$. The $β$'s are given functions of the $α$'s;
division by zero is no problem, because if any $c↓k = 0$ we
must have $c↓r = 0$ (hence the coefficient of $x↑n$ is zero),
or else $λ↓k$ never contributes to the final result.

\ansno 35. Since there are at least five parameter steps, the
result is trivial unless there is at least one parameter multiplication;
considering the ways in which three multiplications can form
$u↓4x↑4$, we see that there must be one parameter multiplication
and two chain multiplications. Therefore the four addition-subtractions
must each be parameter steps, and exercise 34 applies. We can
now assume that only additions are used, and that we have a
chain to compute a general {\sl monic} fourth-degree polynomial
with {\sl two} chain multiplications and four parameter additions.
The only possible scheme of this type that calculates a
fourth-degree polynomial has the form
$$\baselineskip12pt\eqalign{λ↓1⊗= α↓1 + λ↓0\cr
λ↓2⊗= α↓2 + λ↓0\cr
λ↓3⊗=λ↓1\timesλ↓2\cr
λ↓4⊗=α↓3+λ↓3\cr
λ↓5⊗=α↓4+λ↓3\cr
λ↓6⊗=λ↓4\timesλ↓5\cr
λ↓7⊗=α↓5+λ↓6\cr}$$
Actually this chain has one addition too many,
but any correct scheme can be put into this form if we restrict
some of the $α$'s to be functions of the others. Now $λ↓7$
has the form $(x↑2 + Ax + B)(x↑2 + Ax + C) + D = x↑4 + 2Ax↑3
+ (E + A↑2)x↑2 + EAx + F$, where $A = α↓1 + α↓2$,
$B = α↓1α↓2 + α↓3$, $C = α↓1α↓2 + α↓4$, $D = α↓5$, $E = B + C$,
$F = BC + D$; and since this involves only three independent
parameters it cannot represent a general monic fourth-degree
polynomial.

\ansno 36. As in the solution to exercise 35, we may assume that the chain computes
a general monic polynomial of degree six, using only three chain multiplications and
six parameter additions. The computation must take one of two general forms
$$\baselineskip12pt
\hbox to size{$\hfill\eqalign{λ↓1⊗=α↓1+λ↓0\cr
λ↓2⊗=α↓2+λ↓0\cr
λ↓3⊗=λ↓1\timesλ↓2\cr
λ↓4⊗=α↓3+λ↓0\cr
λ↓5⊗=α↓4+λ↓3\cr
λ↓6⊗=λ↓4\timesλ↓5\cr
λ↓7⊗=α↓5+λ↓6\cr
λ↓8⊗=α↓6+λ↓6\cr
λ↓9⊗=λ↓7\times λ↓8\cr
λ↓{10}⊗=α↓7+λ↓9\cr}\hfill
\hfill\eqalign{λ↓1⊗=α↓1+λ↓0\cr
λ↓2⊗=α↓2+λ↓0\cr
λ↓3⊗=λ↓1\timesλ↓2\cr
λ↓4⊗=α↓3+λ↓3\cr
λ↓5⊗=α↓4+λ↓3\cr
λ↓6⊗=λ↓4\timesλ↓5\cr
λ↓7⊗=α↓5+λ↓3\cr
λ↓8⊗=α↓6+λ↓6\cr
λ↓9⊗=λ↓7\times λ↓8\cr
λ↓{10}⊗=α↓7+λ↓9\cr}\hfill$}$$
where, as in exercise 35, an extra addition has been
inserted to cover a more general case. Neither of these schemes
can calculate a general sixth-degree monic polynomial, since
the first case is a polynomial of the form
$$(x↑3 + Ax↑2 + Bx + C)(x↑3 + Ax↑2 + Bx + D) + E,$$
and the second case (cf.\ exercise 35) is a polynomial of the form
$$\biglp x↑4 + 2Ax↑3 + (E + A↑2)x↑2 + EAx + F\bigrp (x↑2 +
Ax + G) + H;$$
both of these involve only five independent parameters.

\ansno 37. Let $u↑{[0]}(x) = u↓nx↑n + u↓{n-1}x↑{n-1} + \cdots
+ u↓0$, $v↑{[0]}(x) = x↑n + v↓{n-1}x↑{n-1} + \cdots + v↓0$. For
$1 ≤ j ≤ n$, divide $u↑{[j-1]}(x)$ by the monic polynomial $v↑{[j-1]}(x)$,
obtaining $u↑{[j-1]}(x) = α↓jv↑{[j-1]}(x) + β↓jv↑{[j]}(x)$.
Assume that a monic polynomial $v↑{[j]}(x)$ of degree $n - j$
exists satisfying this relation; this will be true for almost
all rational functions.
Let $u↑{[j]}(x) = v↑{[j-1]}(x) - xv↑{[j]}(x)$.
These definitions imply that deg$(u↑{[n]}) < 1$, so we may let
$α↓{n+1} = u↑{[n]}(x)$.

For the given rational function we have
$$\vbox{\halign{$\ctr{#}$⊗\qquad$\ctr{#}$⊗\qquad$\ctr{#}$⊗\qquad$\ctr{#}$\cr
α↓j⊗β↓j⊗v↑{[j]}(x)⊗u↑{[j]}(x)\cr
\noalign{\vskip3pt}
1⊗2⊗x + 5⊗3x + 19\cr
3⊗4⊗1⊗5\cr}}$$
so $u↑{[0]}(x)/v↑{[0]}(x) = 1 + 2/\biglp x + 3 + 4/(x + 5)\bigrp $.
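Working backward from this table, the rational function of the exercise is evidently $(x↑2+10x+29)/(x↑2+8x+19)$ (a reconstruction, not stated above). A Python sketch of the division loop, representing polynomials as coefficient lists with constant term first:

```python
from fractions import Fraction

def cont_fraction(u, v):
    # u, v: coefficient lists (constant term first) with v monic and
    # deg u = deg v = n.  Returns [a1, b1, a2, b2, ..., an, bn, a_{n+1}]
    # so that u/v = a1 + b1/(x + a2 + b2/(x + ... + bn/(x + a_{n+1}))).
    # Assumes the generic case: every beta_j is nonzero.
    u = [Fraction(c) for c in u]
    v = [Fraction(c) for c in v]
    out = []
    while len(v) > 1:
        u += [Fraction(0)] * (len(v) - len(u))
        alpha = u[-1]                        # v is monic, same degree as u
        rem = [uc - alpha * vc for uc, vc in zip(u, v)]
        while rem[-1] == 0:
            rem.pop()
        beta = rem[-1]
        vnew = [c / beta for c in rem]       # monic v^[j]
        out += [alpha, beta]
        u = [v[0]] + [v[i] - vnew[i - 1] for i in range(1, len(v))]
        while len(u) > 1 and u[-1] == 0:     # u^[j] = v^[j-1] - x v^[j]
            u.pop()
        v = vnew
    out.append(u[-1])                        # alpha_{n+1}
    return out
```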

{\sl Notes:} A general rational function of the stated
form has $2n + 1$ ``degrees of freedom,'' in the sense that
it can be shown to have $2n + 1$ essentially independent parameters.
If we generalize polynomial chains to ``arithmetic chains,''
which allow division operations as well as addition, subtraction,
and multiplication, we can obtain the following results with
slight modifications to the proofs of Theorems A and M\null: {\sl
An arithmetic chain with $q$ addition-subtraction steps has at
most $q + 1$ degrees of freedom. An arithmetic chain with $m$
multiplication-division
steps has at most $2m + 1$ degrees of freedom.} Therefore an arithmetic
chain that computes almost all rational functions of the stated
form must have at least $2n$ addition-subtractions, and $n$
multiplication-divisions; the method in this exercise is ``optimal.''
%folio 835 galley 9a (C) Addison-Wesley 1978	*
\ansno 38. The theorem is certainly true if $n = 0$. Assume
that $n$ is positive, and that a polynomial chain computing
$P(x; u↓0, \ldotss , u↓n)$ is given, where each of the parameters
$α↓j$ has been replaced by a real number. Let $λ↓i = λ↓j \times
λ↓k$ be the first chain multiplication step that involves one
of $u↓0$, $\ldotss$, $u↓n$; such a step must exist because of the
rank of $A$. Without loss of generality, we may assume that
$λ↓j$ involves $u↓n$; thus, $λ↓j$ has the form $h↓0u↓0 + \cdots
+ h↓nu↓n + f(x)$, where $h↓0$, $\ldotss$, $h↓n$ are real, $h↓n ≠
0$, and $f(x)$ is a polynomial with real coefficients.\xskip$\biglp
$The $h$'s and the coefficients of $f(x)$ are derived from the
values assigned to the $α$'s.$\bigrp$

Now change step $i$ to $λ↓i = α \times λ↓k$, where
$α$ is an arbitrary real number.\xskip (We could take $α = 0$; general
$α$ is used here merely to show that there is a certain amount
of flexibility available in the proof.)\xskip Add further steps to
calculate
$$λ = \biglp α - f(x) - h↓0u↓0 - \cdots - h↓{n-1}u↓{n-1}\bigrp/h↓n;$$
these new steps involve only additions and parameter
multiplications (by suitable new parameters). Finally, replace
$λ↓{-n-1} = u↓n$ everywhere in the chain by this new element
$λ$. The result is a chain that calculates
$$Q(x; u↓0, \ldotss , u↓{n-1}) = P\biglp x; u↓0, \ldotss , u↓{n-1},
(α - f(x) - h↓0u↓0 - \cdots - h↓{n-1}u↓{n-1})/h↓n\bigrp ;$$
and this chain has one less chain multiplication.
The proof will be complete if we can show that $Q$ satisfies
the hypotheses. The quantity $\biglp α - f(x)\bigrp /h↓n$ leads
to a possibly increased value of $m$, and a new vector $B↑\prime
$. If the columns of $A$ are $A↓0$, $A↓1$, $\ldotss$, $A↓n$ (these
vectors being linearly independent over the reals), the new
matrix $A↑\prime$ corresponding to $Q$ has the column vectors
$$A↓0 - (h↓0/h↓n)A↓n,\qquad \ldotss ,\qquad A↓{n-1} - (h↓{n-1}/h↓n)A↓n,$$
plus perhaps a few rows of zeros to account for
an increased value of $m$, and these columns are clearly also
linearly independent. By induction, the chain that computes
$Q$ has at least $n - 1$ chain multiplications, so the original
chain has at least $n$.

$\biglp$Pan showed also that the possibility of division gives no improvement;
cf.\ {\sl Problemy Kibernetiki \bf7} (1962), 21--30.
Generalizations to the computation of several
polynomials in several variables, with and without various kinds
of preconditioning, have been given by S. Winograd, {\sl Comm.\
Pure and Applied Math.\ \bf 23} (1970), 165--179.$\bigrp$

\ansno 39. By induction on $m$. Let $w↓m(x) = x↑{2m} + u↓{2m-1}x↑{2m-1}
+ \cdots + u↓0$, $w↓{m-1}(x) = x↑{2m-2} + v↓{2m-3}x↑{2m-3} + \cdots
+ v↓0$, $a = α↓1 + \gamma ↓m$, $b = α↓m$, and let $$\textstyle f(r) = \sum ↓{i,j≥0}
(-1)↑{i+j}{i+j\choose j}u↓{r+i+2j\,}a↑ib↑j.$$It follows that $v↓r
= f(r + 2)$ for $r ≥ 0$, and $\delta ↓m = f(1)$. If $\delta
↓m = 0$ and $a$ is given, we have a polynomial of degree $m
- 1$ in $b$, with leading coefficient $\pm (u↓{2m-1} - ma) =
\pm (\gamma ↓2 + \cdots + \gamma ↓m - m\gamma ↓m)$.

In Motzkin's unpublished notes he arranged to
make $\delta ↓k = 0$ almost always, by choosing $\gamma$'s
so that this leading coefficient is $≠0$ when $m$ is
even and $=0$ when $m$ is odd; then we almost always can let $b$
be a (real) root of an odd-degree polynomial.

\ansno 40. No; S. Winograd found a way to compute all polynomials
of degree 13 with only 7 (possibly complex) multiplications
[{\sl Comm.\ Pure and Applied Math.\ \bf 25} (1972), 455--457].
L. Revah found schemes that evaluate almost all polynomials
of degree $n ≥ 9$ with $\lfloor n/2\rfloor + 1$ (possibly complex)
multiplications [{\sl SIAM J. Computing} {\bf 4} (1975), 381--392];
she also showed that when $n=9$ it is possible to achieve $\lfloor n/2\rfloor+1$
multiplications only with at least $n+3$ additions. By appending sufficiently
many additions (cf.\ exercise 39), the ``almost all'' and ``possibly complex''
provisos disappear.\xskip V. J. Pan [{\sl Proc. ACM Symp. Theory Comp.\ \bf10}
(1978), 162--172] found schemes with $\lfloor n/2\rfloor+1$ (complex)
multiplications and the minimum number $n+2+\delta↓{n9}$ of (complex) additions, for
all odd $n≥9$; his method for $n=9$ is
$$\baselineskip14pt\cpile{v(x)=\biglp(x+α)↑2+β\bigrp(x+\gamma),\qquad
w(x)=v(x)+x,\cr t(x)=\biglp v(x)+\delta\bigrp\biglp w(x)+ε\bigrp-\biglp
v(x)+\delta↑\prime\bigrp\biglp w(x)+ε↑\prime\bigrp,\cr
u(x)=\biglp v(x)+\zeta\bigrp\biglp t(x)+\eta\bigrp+\kappa.\cr}$$
The minimum number of {\sl real\/} additions necessary, when the minimum number of
(real) multiplications is achieved, remains unknown for $n≥9$.

\ansno 41. $a(c + d) - (a + b)d + i\biglp a(c + d) + (b - a)c\bigrp$.\xskip
[Beware numerical instability. Three multiplications are necessary, since complex
multiplication is a special case of (69) with $p(u)=u↑2+1$.
Without the restriction on additions there are
other possibilities, e.g., the symmetric formula $ac-bd+i\biglp(a+b)(c+d)-ac-bd
\bigrp$ suggested by Peter Ungar in 1963; cf.\ Eq.\ 4.3.3 with $2↑n$
replaced by $i$.
See S. Winograd, {\sl Linear Alg.\ Appl.\ \bf4} (1971), 381--388.]

Alternatively, if $a↑2+b↑2=1$ and $t=(1-a)/b=b/(1+a)$, the algorithm ``$w=c-td$,
$v=d+bw$, $u=w-tv$'' for calculating the product $(a+bi)(c+di)=u+iv$ has been
suggested by Oscar Buneman [{\sl J. Comp. Phys.\ \bf12} (1973), 127--128]. In
this method if $a=\cos\theta$ and $b=\sin\theta$ then $t=\tan(\theta/2)$.
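The three identities above are easy to test against ordinary complex multiplication. A Python sketch (function names are mine):

```python
import math

def mult3(a, b, c, d):
    # (a + bi)(c + di) with three real multiplications.
    t = a * (c + d)
    return t - (a + b) * d, t + (b - a) * c

def ungar(a, b, c, d):
    # Ungar's symmetric three-multiplication formula.
    ac, bd = a * c, b * d
    return ac - bd, (a + b) * (c + d) - ac - bd

def buneman(theta, c, d):
    # Buneman's scheme for multiplying c + di by cos(theta) + i sin(theta).
    b = math.sin(theta)
    t = math.tan(theta / 2)        # t = (1 - a)/b = b/(1 + a)
    w = c - t * d
    v = d + b * w
    return w - t * v, v
```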

\ansno 42. (a)\9 Let $π↓1$, $\ldotss$, $π↓m$ be the $λ↓i$'s that correspond
to chain multiplications; then $π↓i = P↓{2i-1} \times P↓{2i}$
and $u(x) = P↓{2m+1}$, where each $P↓j$ has the form $β↓j +
β↓{j0}x + β↓{j1}π↓1 + \cdots + β↓{jr(j)}π↓{r(j)}$, where $r(j)
≤ \lceil j/2\rceil - 1$ and each of the $β↓j$ and $β↓{jk}$ is
a polynomial in the $α$'s with integer coefficients. We can
systematically modify the chain (cf.\ exercise 30) so that $β↓j
= 0$ and $β↓{jr(j)} = 1$, for $1 ≤ j ≤ 2m$; furthermore we can
assume that $β↓{30} = 0$. The result set now has at most $m
+ 1 + \sum ↓{1≤j≤2m} (\lceil j/2\rceil - 1) = m↑2 + 1$ degrees
of freedom.

(b)\9 Any such polynomial chain with at most $m$
chain multiplications can be simulated by one with the form
considered in (a), except that now we let $r(j) = \lceil j/2\rceil
- 1$ for $1 ≤ j ≤ 2m + 1$, and we do not assume that $β↓{30}
= 0$ or that $β↓{jr(j)} = 1$ for $j ≥ 3$. This single canonical
form involves $m↑2 + 2m$ parameters. As the $α$'s run through all integers
and as we run through all chains, the $β$'s run through at most
$2↑{m↑2+2m}$ sets of values mod 2, hence the result
set does also. In order to obtain all $2↑n$ polynomials of degree
$n$ with 0-1 coefficients, we need $m↑2 + 2m ≥ n$.

(c)\9 Set $m ← \lfloor \sqrt{n}\rfloor$ and compute
$x↑2$, $x↑3$, $\ldotss$, $x↑m$. Let $u(x) = u↓{m+1}(x)x↑{(m+1)m} +
\cdots + u↓1(x)x↑m + u↓0(x)$, where each $u↓j(x)$ is a polynomial
of degree $≤m$ with integer coefficients (hence it can be evaluated
without any more multiplications). Now evaluate $u(x)$ by rule
(2) as a polynomial in $x↑m$ with known coefficients.\xskip (The number
of additions used is approximately the sum of the absolute values
of the coefficients, so this algorithm is efficient on 0-1
polynomials. Paterson and Stockmeyer also gave another algorithm
that uses about $\sqrt{2n}$ multiplications.)
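A sketch of the evaluation scheme of part (c), which uses about $m-1+n/m\approx2\sqrt n$ multiplications (`paterson_stockmeyer` is my name for it, not theirs):

```python
import math

def paterson_stockmeyer(u, x):
    # Evaluate u[0] + u[1]*x + ... + u[n]*x^n with about 2*sqrt(n)
    # multiplications: precompute x^2, ..., x^m for m = floor(sqrt(n)),
    # then treat u as a polynomial in x^m, evaluated by rule (2).
    n = len(u) - 1
    m = max(1, math.isqrt(n))
    powers = [1, x]
    for _ in range(m - 1):
        powers.append(powers[-1] * x)        # m - 1 multiplications
    result = 0
    for i in reversed(range(0, n + 1, m)):
        # the inner sums cost no multiplications when each u[i] is 0 or 1
        inner = sum(c * p for c, p in zip(u[i:i + m], powers))
        result = result * powers[m] + inner  # one multiplication each
    return result
```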

Reference: {\sl SIAM J. Computing \bf2} (1973), 60--66; see also
J. E. Savage, {\sl SIAM J. Computing \bf3} (1974), 150--158.
For analogous results about
additions, see A. Borodin and S. Cook,
{\sl SIAM J. Computing \bf5} (1976), 146--157.

\ansno 43. When $a↓i = a↓j + a↓k$ is a step in some optimal
addition chain for $n + 1$, compute $x↑i = x↑jx↑k$ and $p↓i=
p↓kx↑j + p↓j$, where $p↓i = x↑{i-1} + \cdots + x + 1$; omit
the final calculation of $x↑{n+1}$. We save one multiplication
whenever $a↓k = 1$, in particular when $i = 1$.\xskip $\biglp$Cf.\
exercise 4.6.3--31 with $ε = {1\over 2}$.$\bigrp$
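A Python sketch of this construction (the addition chain is supplied explicitly; finding a good one is the problem of Section 4.6.3, and the final $x↑{n+1}$ is computed here anyway for simplicity):

```python
def sum_of_powers(chain, x):
    # Given an addition chain 1 = a_0, a_1, ..., a_r = n + 1, compute
    # p_{n+1} = x^n + ... + x + 1 using p_{j+k} = p_k * x^j + p_j.
    xs = {1: x}                  # x^a for each a in the chain
    ps = {1: 1}                  # p_a = (x^a - 1)/(x - 1)
    for a in chain[1:]:
        j = next(j for j in xs if a - j in xs)   # a = j + k
        k = a - j
        xs[a] = xs[j] * xs[k]
        ps[a] = ps[k] * xs[j] + ps[j]
    return ps[chain[-1]]
```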

\ansno 44. It suffices to show that $(b↓{ijk})$'s rank is {\sl at most} that of
$(a↓{ijk})$, since we can obtain $(a↓{ijk})$ back from $(b↓{ijk})$ by transforming
it in the same way with $F↑{-1}$, $G↑{-1}$, $H↑{-1}$. If $a↓{ijk}=\sum↓{1≤l≤t}
α↓{il\,}β↓{jl\,}\gamma↓{kl}$ then it follows immediately that
$$\textstyle b↓{ijk}=\sum↓{1≤l≤t}\biglp\sum↓{1≤p≤m}F↓{ip\,}α↓{pl}\bigrp
\biglp\sum↓{1≤q≤n}G↓{jq\,}β↓{ql}\bigrp
\biglp\sum↓{1≤r≤s}H↓{kr\,}\gamma↓{rl}\bigrp.$$

[H. F. de Groote has proved that all normal schemes that yield $2\times2$ matrix
products with 7 chain multiplications are equivalent, in the sense that they can
be obtained from each other by nonsingular matrix multiplication as in this
exercise. In this sense Strassen's algorithm is unique.]

\ansno 45. By exercise 44 we can add any multiple of a row, column, or plane to
another one without changing the rank; we can also multiply a row, column,
or plane by a nonzero constant, or transpose the tensor. A sequence of such
operations can always be found to reduce a given $2\times2\times2$ tensor to one
of the forms ${0\,0\choose0\,0}{0\,0\choose0\,0}$, ${1\,0\choose0\,0}{0\,0\choose
0\,0}$, ${1\,0\choose0\,1}{0\,0\choose0\,0}$, ${1\,0\choose0\,0}{0\,0\choose0\,1}$,
${1\,0\choose0\,1}{0\,1\choose q\,r}$. The last tensor has rank 3 or 2 according as
the polynomial $u↑2-ru-q$ has one or two irreducible factors in the field of
interest, by Theorem W.

\ansno 46. A general $m\times n\times s$ tensor has $mns$ degrees of freedom. By
exercise 28 it is impossible to express all $m\times n\times s$ tensors in terms
of the $(m+n+s)t$ elements of a realization $A$, $B$, $C$ unless $(m+n+s)t≥mns$.
On the other hand, assume that $m≥n≥s$. The rank of an $m\times n$ matrix is at most
$n$, so we can realize any tensor in $ns$ chain multiplications by realizing each
matrix plane separately.\xskip[Exercise 45 shows that the lower bound on the maximum
tensor rank is not best possible, nor is the upper bound. Thomas D. Howell
(Ph. D. thesis, Cornell Univ., 1976) has shown that there are tensors of rank
$≥\lceil mns/(m+n+s-2)\rceil$ over the complex numbers.]

\ansno 49. By Lemma T\null, rank$(a↓{ijk})≥\hbox{rank}(a↓{i(jk)})$. Conversely if
$M$ is a matrix of rank $t$ we can transform it by row and column operations,
finding nonsingular matrices $F$ and $G$ such that $FMG$ has all entries 0 except
for $t$ diagonal elements that are 1; cf.\ Algorithm 4.6.2N\null. The tensor rank
of $FMG$ is therefore $≤t$; and it is the same as the tensor rank of $M$, by
exercise 44.

\ansno 50. Let $i=\langle i↑\prime,i↑{\prime\prime}\rangle$ where $1≤i↑\prime≤m$
and $1≤i↑{\prime\prime}≤n$; then $a↓{\langle i↑\prime,i↑{\prime\prime}\rangle jk}
=\delta↓{i↑{\prime\prime}j}\delta↓{i↑\prime k}$, and it is clear that
rank$(a↓{i(jk)})=mn$ since $(a↓{i(jk)})$ is a permutation matrix. By Lemma\penalty
999\ L\null,
rank$(a↓{ijk})≥mn$. Conversely, since $(a↓{ijk})$ has only $mn$ nonzero entries,
its rank is clearly $≤mn$.\xskip(There is consequently no normal scheme requiring
fewer than the $mn$ obvious multiplications. There is no such abnormal scheme
either [{\sl Comm.\ Pure and Appl.\ Math.\ \bf23} (1970), 165--179]. But some
savings can be achieved if the same matrix is used with $s>1$ different column
vectors, since this is equivalent to $(m\times n)$ times $(n\times s)$ matrix
multiplication.)

\ansno 51. (a)\9 $s↓1=y↓0+y↓1$, $s↓2=y↓0-y↓1$; $m↓1={1\over2}(x↓0+x↓1)s↓1$,
$m↓2={1\over2}(x↓0-x↓1)s↓2$; $w↓0=m↓1+m↓2$, $w↓1=m↓1-m↓2$.\xskip(b) Here are some
intermediate steps, using the methodology in the text: $\biglp(x↓0-x↓2)+(x↓1-x↓2)u
\bigrp\biglp(y↓0-y↓2)+(y↓1-y↓2)u\bigrp\mod(u↑2+u+1)=\biglp(x↓0-x↓2)(y↓0-y↓2)-
(x↓1-x↓2)(y↓1-y↓2)\bigrp+\biglp(x↓0-x↓2)(y↓0-y↓2)-(x↓1-x↓0)(y↓1-y↓0)\bigrp u$.
The first realization is
$$\def\\#1.#2.#3.{\biggglp\,\vcenter{\vskip-4pt
\halign{##\cr\vbox to 9pt{}#1\cr#2\cr#3\cr}}\,\bigggrp}
\\1 1 \=1 0.1 0 1 1.1 \=1 0 \=1.,\qquad
\\1 1 \=1 0.1 0 1 1.1 \=1 0 \=1.,\qquad
\\1 1 1 \=2.1 1 \=2 1.1 \=2 1 1.\times\textstyle{1\over3}.$$
The second realization is
$$\def\\#1.#2.#3.{\biggglp\,\vcenter{\vskip-4pt
\halign{##\cr\vbox to 9pt{}#1\cr#2\cr#3\cr}}\,\bigggrp}
\\1 1 1 \=2.1 1 \=2 1.1 \=2 1 1.\times\textstyle{1\over3},\qquad
\\1 1 \=1 0.1 \=1 0 \=1.1 0 1 1.,\qquad
\\1 1 \=1 0.1 0 1 1.1 \=1 0 \=1..$$
The resulting algorithm computes $s↓1=y↓0+y↓1$, $s↓2=y↓0-y↓1$, $s↓3=y↓2-y↓0$,
$s↓4=y↓2-y↓1$, $s↓5=s↓1+y↓2$; $m↓1={1\over3}(x↓0+x↓1+x↓2)s↓5$,
$m↓2={1\over3}(x↓0+x↓1-2x↓2)s↓2$, $m↓3={1\over3}(x↓0-2x↓1+x↓2)s↓3$,
$m↓4={1\over3}(-2x↓0+x↓1+x↓2)s↓4$; $t↓1=m↓1+m↓2$, $t↓2=m↓1-m↓2$, $t↓3=m↓1+m↓3$,
$w↓0=t↓1-m↓3$, $w↓1=t↓3+m↓4$, $w↓2=t↓2-m↓4$.
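The degree-3 algorithm just derived can be checked against the defining sums $w↓k=\sum x↓iy↓j$, $i+j≡k\modulo3$. A Python sketch, using exact rational arithmetic so that the divisions by 3 cost nothing in precision:

```python
from fractions import Fraction

def cyclic3(x, y):
    # The four-multiplication cyclic convolution of degree 3 given above.
    x0, x1, x2 = (Fraction(c) for c in x)
    y0, y1, y2 = y
    s1 = y0 + y1; s2 = y0 - y1; s3 = y2 - y0; s4 = y2 - y1; s5 = s1 + y2
    m1 = (x0 + x1 + x2) / 3 * s5
    m2 = (x0 + x1 - 2 * x2) / 3 * s2
    m3 = (x0 - 2 * x1 + x2) / 3 * s3
    m4 = (-2 * x0 + x1 + x2) / 3 * s4
    t1 = m1 + m2; t2 = m1 - m2; t3 = m1 + m3
    return t1 - m3, t3 + m4, t2 - m4

def cyclic_direct(x, y):
    # w_k = sum of x_i * y_j over i + j = k (mod n), for checking.
    n = len(x)
    return tuple(sum(x[i] * y[(k - i) % n] for i in range(n))
                 for k in range(n))
```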

\def\dprime{{\prime\prime}}
\ansno 52. Let $i=\langle i↑\prime,i↑\dprime\rangle$ when $i\mod n↑\prime=i↑\prime$
and $i\mod n↑\dprime=i↑\dprime$. Then we wish to compute $w↓{\langle k↑\prime,
k↑\dprime\rangle}=\sum x↓{\langle i↑\prime,i↑\dprime\rangle}y↓{\langle j↑\prime,
j↑\dprime\rangle}$ summed for $i↑\prime+j↑\prime≡k↑\prime\modulo{n↑\prime}$ and
$i↑\dprime+j↑\dprime≡k↑\dprime\modulo{n↑\dprime}$. This can be done by applying
the $n↑\prime$ algorithm to the $2n↑\prime$ vectors $X↓{i↑\prime}$ and $Y↓{j↑\prime
}$ of length $n↑\dprime$, obtaining the $n↑\prime$ vectors $W↓{k↑\prime}$. Each
vector addition becomes $n↑\dprime$ additions, each parameter multiplication
becomes $n↑\dprime$ parameter multiplications, and each chain multiplication of
vectors is replaced by a cyclic convolution of degree $n↑\dprime$.\xskip $\biglp$If
the subalgorithms use the minimum number of chain multiplications, this algorithm
uses $2\biglp n↑\prime-d(n↑\prime)\bigrp\biglp n↑\dprime-d(n↑\dprime)\bigrp$ more
than the minimum, where $d(n)$ is the number of divisors of $n$.$\bigrp$

\ansno 53. (a)\9 Let $n(k)=(p-1)p↑{e-k-1}=\varphi(p↑{e-k})$ for $0≤k<e$, and $n(k)=
1$ for $k≥e$. Represent the numbers $\{1,\ldotss,m\}$ in the form $a↑ip↑k\modulo m$,
where $0≤k≤e$ and $a$ is a fixed primitive element modulo $p↑e$. For example,
when $m=9$ we can let $a=2$; the values are $\{2↑03↑0,2↑13↑0,2↑03↑1,2↑23↑0,2↑53↑0,
2↑13↑1,2↑43↑0,2↑33↑0,2↑03↑2\}$. Then $f(a↑ip↑k)=\sum↓{0≤l≤e}\sum↓{0≤j<n(l)}
\omega↑{g(i,j,k,l)}F(a↑jp↑l)$ where $g(i,j,k,l)=a↑{i+j}p↑{k+l}$.

We shall compute $f↓{ikl}=\sum↓{0≤j<n(l)}\omega↑{g(i,j,k,l)}F(a↑jp↑l)$ for $0≤i<
n(k)$ and for each $k$, $l$. This is a cyclic convolution of degree $n(k+l)$ on
the values $x↓i=\omega↑{a↑ip↑{k+l}}$ and $y↓s=\sum↓{0≤j<n(l),\,s+j≡0\modulo{n(k+l)}}
F(a↑jp↑l)$, since $f↓{ikl}=\sum x↓ry↓s$ summed over $r+s≡i$ $\biglp$modulo
$n(k+l)\bigrp$. The Fourier transform is obtained by summing appropriate
$f↓{ikl}$'s.\xskip$\biglp${\sl Note:} When linear combinations of the $x↓i$ are
formed, e.g., as in (67), the result will be purely real or purely imaginary,
when the cyclic convolution algorithm has been constructed by using rule (57) with
$u↑{n(k)}-1=(u↑{n(k)/2}-1)(u↑{n(k)/2}+1)$. The reason is that reduction mod
$(u↑{n(k)/2}-1)$ produces a polynomial with real coefficients $\omega↑j+\omega↑{-j}$
while reduction mod $(u↑{n(k)/2}+1)$ produces a polynomial with imaginary
coefficients $\omega↑j-\omega↑{-j}$.$\bigrp$

When $p=2$ an analogous construction applies, using the representation $(-1)↑ia↑j
2↑k\modulo m$, where $0≤k≤e$ and $0≤i≤\min(e-k,1)$ and $0≤j<2↑{e-k-2}$. In this
case we use the construction of exercise 52 with $n↑\prime=2$ and $n↑\dprime=
2↑{e-k-2}$; although these numbers are not relatively prime, the construction does
yield the desired direct product of cyclic convolutions.

(b)\9 Let $a↑\prime m↑\prime+a↑\dprime m↑\dprime=1$; and let $\omega↑\prime=
\omega↑{a↑\dprime m↑\dprime}$, $\omega↑\dprime=\omega↑{a↑\prime m↑\prime}$. Define
$s↑\prime=s\mod m↑\prime$, $s↑\dprime=s\mod m↑\dprime$, $t↑\prime=t\mod m↑\prime$,
$t↑\dprime=t\mod m↑\dprime$, so that $\omega↑{st}=(\omega↑\prime)↑{s↑\prime
t↑\prime}(\omega↑\dprime)↑{s↑\dprime t↑\dprime}$. It follows that $f(s↑\prime,s↑
\dprime)=\sum↓{0≤t↑\prime<m↑\prime,\,0≤t↑\dprime<m↑\dprime}(\omega↑\prime)↑{s↑
\prime t↑\prime}(\omega↑\dprime)↑{s↑\dprime t↑\dprime}F(t↑\prime,t↑\dprime)$; in
other words, the one-dimensional Fourier transform on $m$ elements is actually a
two-dimensional Fourier transform on $m↑\prime\times m↑\dprime$ elements, in
slight disguise.
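The identity $\omega↑{st}=(\omega↑\prime)↑{s↑\prime t↑\prime}(\omega↑\dprime)↑{s↑\dprime t↑\dprime}$ can be verified numerically; the following Python sketch (an illustration, with names of my choosing) computes the transform both ways and compares:

```python
# The two-dimensional disguise of the one-dimensional DFT when m = m1*m2,
# gcd(m1, m2) = 1: w1 = w^(a2*m2) and w2 = w^(a1*m1), where
# a1*m1 ≡ 1 (mod m2) and a2*m2 ≡ 1 (mod m1).
import cmath

def dft(F):
    m = len(F)
    w = cmath.exp(2j * cmath.pi / m)
    return [sum(w**(s * t) * F[t] for t in range(m)) for s in range(m)]

def dft_crt(F, m1, m2):
    m = len(F)
    a1 = pow(m1, -1, m2)            # modular inverse (Python 3.8+)
    a2 = pow(m2, -1, m1)
    w = cmath.exp(2j * cmath.pi / m)
    w1, w2 = w**(a2 * m2), w**(a1 * m1)   # primitive m1-th and m2-th roots
    return [sum(w1**((s % m1) * (t % m1)) * w2**((s % m2) * (t % m2)) * F[t]
                for t in range(m))
            for s in range(m)]
```

Because $a↑\prime m↑\prime+a↑\dprime m↑\dprime≡1$ modulo $m$, the exponents agree modulo both $m↑\prime$ and $m↑\dprime$, hence modulo $m$.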

We shall deal with ``normal'' algorithms consisting of (i) a number of sums $s↓i$
of the $F$'s and $s$'s; followed by (ii) a number of products $m↓j$, each of which
is obtained by multiplying one of the $F$'s or $s$'s by a real or imaginary number
$α↓j$; followed by (iii) a number of further sums $t↓k$, each of which is formed
from $m$'s or $t$'s (not $F$'s or $s$'s). The final values must be $m$'s or $t$'s.
For example, the ``normal'' Fourier transform scheme for $m=5$ constructed from
(67) and the method of part (a) is as follows: $s↓1=F(1)+F(4)$, $s↓2=F(3)+F(2)$,
$s↓3=s↓1+s↓2$, $s↓4=s↓1-s↓2$, $s↓5=F(1)-F(4)$, $s↓6=F(2)-F(3)$, $s↓7=s↓5-s↓6$;
$m↓1={1\over4}(\omega+\omega↑2+\omega↑4+\omega↑3)s↓3$, $m↓2={1\over4}(\omega-
\omega↑2+\omega↑4-\omega↑3)s↓4$, $m↓3={1\over2}(\omega+\omega↑2-\omega↑4-\omega↑3)
s↓5$, $m↓4={1\over2}(-\omega+\omega↑2+\omega↑4-\omega↑3)s↓6$, $m↓5={1\over2}(
\omega↑3-\omega↑2)s↓7$, $m↓6=1\cdot F(5)$, $m↓7=1\cdot s↓3$; $t↓0=m↓1+m↓6$, $t↓1=
t↓0+m↓2$, $t↓2=m↓3+m↓5$, $t↓3=t↓0-m↓2$, $t↓4=m↓4-m↓5$, $t↓5=t↓1+t↓2$, $t↓6=t↓3+t↓4$,
$t↓7=t↓1-t↓2$, $t↓8=t↓3-t↓4$, $t↓9=m↓6+m↓7$. Note the multiplication by 1 shown in
$m↓6$ and $m↓7$; this is required by our conventions, and it is important to
include such cases for use in recursive constructions (although the multiplications
need not really be done). Here $m↓6=f↓{001}$, $m↓7=f↓{010}$, $t↓5=f↓{000}+f↓{001}
=f(2↑0)$, $t↓6=f↓{100}+f↓{101}=f(2↑1)$, etc. We can improve the scheme by
introducing $s↓8=s↓3+F(5)$, replacing $m↓1$ by $\biglp{1\over4}(\omega+\omega↑2
+\omega↑4+\omega↑3)-1\bigrp s↓3$ [this is $-{5\over4}s↓3$], replacing $m↓6$ by
$1\cdot s↓8$, and deleting $m↓7$ and $t↓9$; this saves one of the trivial
multiplications by 1, and it will be advantageous when the scheme is used to
build larger ones. In the improved scheme, $f(5)=m↓6$, $f(1)=t↓5$, $f(2)=t↓6$,
$f(3)=t↓8$, $f(4)=t↓7$.
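The $m=5$ scheme above (in its unimproved form) can be transcribed literally and checked against the direct transform $f(k)=\sum↓t\omega↑{kt}F(t)$, reading $F(5)$ as $F(0)$ and $f(5)$ as $f(0)$. A Python transcription for checking purposes (the function name is mine):

```python
# The "normal" m = 5 scheme, step by step; w = e^(2*pi*i/5).
import cmath

def winograd5(F):                    # F = [F(0), F(1), F(2), F(3), F(4)]
    w = cmath.exp(2j * cmath.pi / 5)
    s1 = F[1] + F[4]; s2 = F[3] + F[2]; s3 = s1 + s2; s4 = s1 - s2
    s5 = F[1] - F[4]; s6 = F[2] - F[3]; s7 = s5 - s6
    m1 = (w + w**2 + w**4 + w**3) / 4 * s3
    m2 = (w - w**2 + w**4 - w**3) / 4 * s4
    m3 = (w + w**2 - w**4 - w**3) / 2 * s5
    m4 = (-w + w**2 + w**4 - w**3) / 2 * s6
    m5 = (w**3 - w**2) / 2 * s7
    m6 = 1 * F[0]; m7 = 1 * s3
    t0 = m1 + m6; t1 = t0 + m2; t2 = m3 + m5; t3 = t0 - m2; t4 = m4 - m5
    t5 = t1 + t2; t6 = t3 + t4; t7 = t1 - t2; t8 = t3 - t4; t9 = m6 + m7
    return [t9, t5, t6, t8, t7]      # [f(0), f(1), f(2), f(3), f(4)]
```

Only $m↓1$ through $m↓5$ are nontrivial multiplications, and each multiplier is purely real or purely imaginary, as the note in part (a) predicts.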

Now suppose we have normal one-dimensional schemes for $m↑\prime$ and $m↑\dprime$,
using respectively $(a↑\prime,a↑\dprime)$ complex additions, $(t↑\prime,t↑\dprime)$
trivial multiplications by $\pm1$ or $\pm i$, and a total of $(c↑\prime,c↑\dprime)$
complex multiplications including the trivial ones.\xskip(The nontrivial complex
multiplications are all ``simple'' since they involve only two real multiplications
and no real additions.)\xskip We can construct a normal scheme for the
two-dimensional $m↑\prime\times m↑\dprime$ case by applying the $m↑\prime$ scheme
to vectors $F(t↑\prime,\ast)$ of length $m↑\dprime$. Each $s↓i$ step becomes
$m↑\dprime$ additions; each $m↓j$ becomes a Fourier transform on $m↑\dprime$
elements, but with all of the $α$'s in this algorithm multiplied by $α↓j$; and
each $t↓k$ becomes $m↑\dprime$ additions. Thus the new algorithm has $(a↑\prime
m↑\dprime+c↑\prime a↑\dprime)$ complex additions, $t↑\prime t↑\dprime$ trivial
multiplications, and a total of $c↑\prime c↑\dprime$ complex multiplications.

Using these techniques, Winograd has constructed normal one-dimensional schemes
for the following small values of $m$ with the following costs $(a,t,c)$:
$$\baselineskip13pt
\vbox{\halign{$m=#\hfill$⊗\quad$(\hfill#$,⊗$\,#,$⊗$\,\hfill#)$⊗\hskip 100pt
$m=#\hfill$⊗\quad$(\hfill#$,⊗$\,#,$⊗$\,\hfill#)$\cr
2⊗2⊗2⊗2⊗7⊗36⊗1⊗9\cr
3⊗6⊗1⊗3⊗8⊗26⊗6⊗8\cr
4⊗8⊗4⊗4⊗9⊗46⊗1⊗12\cr
5⊗17⊗1⊗6⊗16⊗74⊗8⊗18\cr}}$$
By combining these schemes as described above, we obtain methods that use fewer
arithmetic operations than the ``fast Fourier transform'' (FFT) discussed in
exercise 14. For example, when $m=1008=7\cdot9\cdot16$, the costs come to
$(17946,8,1944)$, so we can do a Fourier transform on 1008 complex numbers with
3872 real multiplications and 35892 real additions. By contrast, the FFT on 1024
complex numbers involves 14344 real multiplications and 27652 real additions.
If the two-passes-at-once improvement in the answer to exercise 14 is used,
however, the FFT on 1024 complex numbers needs only 10936 real multiplications and
25948 additions, and it is not difficult to implement. Therefore Winograd's method
is faster only on machines that take significantly longer to multiply than to add.
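The cost bookkeeping for $m=1008$ can be reproduced from the table and the nesting rule stated earlier; the following sketch (my own check, not code from the text) nests the $16$-scheme around a $63$-scheme that has the $7$-scheme outermost:

```python
# Nesting an m1-scheme of cost (a1, t1, c1) around an m2-scheme of cost
# (a2, t2, c2) yields cost (a1*m2 + c1*a2, t1*t2, c1*c2).
def nest(outer, inner, m2):
    a1, t1, c1 = outer
    a2, t2, c2 = inner
    return (a1 * m2 + c1 * a2, t1 * t2, c1 * c2)

# Winograd's costs (a, t, c) from the table above.
cost = {2: (2, 2, 2), 3: (6, 1, 3), 4: (8, 4, 4), 5: (17, 1, 6),
        7: (36, 1, 9), 8: (26, 6, 8), 9: (46, 1, 12), 16: (74, 8, 18)}

c63 = nest(cost[7], cost[9], 9)      # m = 63, 7-scheme outermost
c1008 = nest(cost[16], c63, 63)      # m = 1008 = 16 * 63
```

Each of the $1944-8=1936$ nontrivial complex multiplications costs two real multiplications, giving 3872, and the 17946 complex additions cost 35892 real additions.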

[Reference: {\sl Proc.\ Nat.\ Acad.\ Sci.\ USA \bf73} (1976), 1005--1006.]

\ansno 54. $\max\biglp 2e↓1\hbox{deg}(p↓1)-1,\ldotss,2e↓r\hbox{deg}(p↓r)-1,r+1\bigrp$.

\ansno 55. $2n↑\prime-r↑\prime$, where $n↑\prime$ is the degree of the minimum
polynomial of $P$ (i.e., the monic
polynomial $\mu$ of least degree such that $\mu(P)$ is the zero matrix) and
$r↑\prime$ is the number of distinct irreducible factors it has.\xskip(Reduce $P$
by similarity transformations.)

\ansno 56. Let $b↓{ijk}+b↓{jik}=c↓{ijk}+c↓{jik}$, for all $i$, $j$, $k$. If
$A$, $B$, $C$ is a realization of $(b↓{ijk})$ of rank $t$, then $\sum↓{1≤l≤t}
\gamma↑{kl\,}\biglp\sum\alpha↓{il\,}x↓i\bigrp\biglp\sumβ↓{jl\,}x↓j\bigrp=
\sum↓{i,j}b↓{ijk}x↓ix↓j=\sum↓{i,j}c↓{ijk}x↓ix↓j$ for all $k$. Conversely, let the
$l$th chain multiplication of a polynomial chain, for $1≤l≤t$, be the product
$\biglp α↓l+\sum α↓{il\,}x↓i\bigrp\biglp β↓l+\sum β↓{jl}x↓j\bigrp$, where
$α↓l$ and $β↓l$ denote possible constant terms and/or nonlinear terms. All terms
of degree 2 appearing at any step of the chain can be expressed as a linear
combination $\sum↓{1≤l≤t}\gamma↓{l\,}\biglp\sum α↓{il\,}x↓i\bigrp\biglp\sum
β↓{jl\,}x↓j\bigrp$; hence the chain defines a tensor $(b↓{ijk})$ of rank $≤t$
such that $b↓{ijk}+b↓{jik}=c↓{ijk}+c↓{jik}$. This establishes the hint. Now
\def\\{\hbox{rank}}$\\(c↓{ijk}+c↓{jik})=\\(b↓{ijk}+b↓{jik})≤\\(b↓{ijk})+\\(b↓
{jik})=2\\(b↓{ijk})$.

A bilinear form in $x↓1$, $\ldotss$, $x↓m$, $y↓1$, $\ldotss$, $y↓n$ is a
quadratic form in $m+n$ variables, where $c↓{ijk}=a↓{i,j-m,k}$ for $i≤m$ and $j>m$,
otherwise $c↓{ijk}=0$. Now $\\(c↓{ijk})+\\(c↓{jik})≥\\(a↓{ijk})$, since we obtain
a realization of $(a↓{ijk})$ by suppressing the last $n$ rows of $A$ and the first
$m$ rows of $B$ in a realization $A$, $B$, $C$ of $(c↓{ijk}+c↓{jik})$.

\ansno 57. Let $N$ be the smallest power of 2 that exceeds
$2n$, and let $u↓{n+1} = \cdots = u↓{N-1} = v↓{n+1} = \cdots
= v↓{N-1} = 0$. If $U↓s = \sum ↓{0≤t<N} u↓t\omega ↑{st}$, $V↓s
= \sum ↓{0≤t<N} v↓t\omega ↑{st}$, $0 ≤ s < N$, $\omega = e↑{2πi/N}$,
then $\sum ↓{0≤s<N} U↓sV↓s\omega ↑{-st} = N \sum u↓{t↓1}v↓{t↓2}$, where
the latter sum is taken over all $t↓1$ and $t↓2$
with $0 ≤ t↓1, t↓2 < N$, $t↓1 + t↓2 ≡ t\modulo N$. The terms
vanish unless $t↓1 ≤ n$ and $t↓2 ≤ n$, so $t↓1 + t↓2 < N$; thus the
sum is the coefficient of $z↑t$ in the product $u(z)v(z)$. If
we use the method of exercise 14 to compute the Fourier transforms and
the inverse transforms, the number of complex operations is
$O(N\log N) + O(N \log N) + O(N) + O(N\log N)$; and $N
< 4n$.\xskip [Cf.\ Section 4.3.3 and the paper by J. M. Pollard, {\sl Math.\
Comp.\ \bf 25} (1971), 365--374.]
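A minimal Python sketch of this method (an illustration only; a textbook recursive FFT stands in for the algorithm of exercise 14, and the padding is taken to the least sufficient power of 2 rather than literally exceeding $2n$):

```python
# Multiply polynomials by transforming, multiplying pointwise, and
# transforming back; exact integer coefficients are recovered by rounding.
import cmath

def fft(a, inverse=False):
    N = len(a)                       # N must be a power of 2
    if N == 1:
        return a[:]
    even, odd = fft(a[0::2], inverse), fft(a[1::2], inverse)
    sign = -1 if inverse else 1
    out = [0] * N
    for s in range(N // 2):
        t = cmath.exp(sign * 2j * cmath.pi * s / N) * odd[s]
        out[s], out[s + N // 2] = even[s] + t, even[s] - t
    return out

def polymul(u, v):
    n = len(u) + len(v) - 1          # number of product coefficients
    N = 1
    while N < n:                     # large enough that no wraparound occurs
        N *= 2
    U = fft(u + [0] * (N - len(u)))
    V = fft(v + [0] * (N - len(v)))
    W = fft([x * y for x, y in zip(U, V)], inverse=True)
    return [round((W[k] / N).real) for k in range(n)]
```

The unnormalized inverse transform returns $N$ times each coefficient, whence the division by $N$ at the end.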

The number $\omega$ cannot be represented
exactly as a pair of floating-point values
inside a computer, but V. Strassen has shown that it
isn't necessary to have too much accuracy to deduce exact results
when the coefficients are integers [cf.\ {\sl Computing} {\bf 7}
(1971), 281--292].

When multiplying integer polynomials, it is possible to use an {\sl integer} number
$\omega$ that is of order $2↑t$ modulo a prime $p$, and to
determine the results modulo sufficiently many primes. Useful
primes in this regard, together with their least primitive roots
$r$ (from which we take $\omega = r↑{(p-1)/2↑t}\mod
p$ when $p \mod 2↑t = 1)$, can be found as described in Section
4.5.4. For $t = 9$, the ten largest cases $<2↑{35}$ are $p = 2↑{35}
- 512a + 1$, where $(a, r) = (28, 7)$, $(31, 10)$, $(34, 13)$, $(56,
3)$, $(58, 10)$, $(76, 5)$, $(80, 3)$, $(85, 11)$, $(91, 5)$, $(101, 3)$;
the ten largest cases $<2↑{31}$ are $p = 2↑{31} - 512a + 1$, where
$(a, r) = (1, 10)$, $(11, 3)$, $(19, 11)$, $(20, 3)$, $(29, 3)$, $(35,
3)$, $(55, 19)$, $(65, 6)$, $(95, 3)$, $(121, 10)$. For larger $t$,
all primes $p$ of the form $2↑tq + 1$ where $q < 32$ is odd
and $2↑{24} < p < 2↑{36}$ are given by $(p - 1, r) = (11 \cdot
2↑{21}, 3)$, $(25 \cdot 2↑{20}, 3)$, $(27 \cdot 2↑{20}, 5)$, $(25
\cdot 2↑{22}, 3)$, $(27 \cdot 2↑{22}, 7)$, $(5 \cdot 2↑{25}, 3)$,
$(7 \cdot 2↑{25}, 3)$, $(7 \cdot 2↑{26}, 3)$, $(27 \cdot 2↑{26},
13)$, $(15 \cdot 2↑{27}, 31)$, $(17 \cdot 2↑{27}, 3)$, $(3 \cdot 2↑{30},
5)$, $(13 \cdot 2↑{28}, 3)$, $(29 \cdot 2↑{27}, 3)$, $(23 \cdot
2↑{29}, 5)$. Some of the latter primes can be used with $\omega
= 2↑e$ for appropriate small $e$. For a discussion of such primes, see R. M.
Robinson, {\sl Proc.\ Amer.\ Math.\ Soc.\ \bf9} (1958), 673--681; S. W. Golomb,
{\sl Math.\ Comp.\ \bf30} (1976), 657--663.
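The entries in these lists are easy to spot-check. The following sketch (my own check) examines the first case for $t=9$: that $p=2↑{35}-512\cdot28+1$ is prime, that $p\mod 2↑9=1$, and that $\omega=7↑{(p-1)/2↑9}\mod p$ has order exactly $2↑9$ (it does, because 7 is a primitive root, so $7↑{(p-1)/2}≡-1$):

```python
# Deterministic Miller-Rabin for n < 3.3 * 10^24 (these bases suffice).
def is_prime(n):
    if n < 2:
        return False
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if a % n == 0:
            continue
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True

p = 2**35 - 512 * 28 + 1
t, r = 9, 7
omega = pow(r, (p - 1) // 2**t, p)   # an element of order 2^t = 512
```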

\ansno 58. (a)\9 In general if $A$, $B$, $C$ realizes $(a↓{ijk})$, then
$(x↓1,\ldotss,x↓m)A$, $B$, $C$ is a realization of the $1\times n\times s$ matrix
whose entry in row $j$, column $k$ is $\sum x↓ia↓{ijk}$. So there must be at
least as many nonzero elements in $(x↓1,\ldotss,x↓m)A$ as the rank of this matrix.
In the case of the $m\times n\times(m+n-1)$ tensor corresponding to polynomial
multiplication of degree $m-1$ by degree $n-1$, the corresponding matrix has
rank $n$ whenever $(x↓1,\ldotss,x↓m)≠(0,\ldotss,0)$. A similar statement holds
with $A↔B$ and $m↔n$.\xskip$\biglp$In particular, if we work over the field of
2 elements, this says that the rows of $A$ modulo 2 form a ``linear code'' of
$m$ vectors having distance at least $n$, whenever $A$, $B$, $C$ is a realization
consisting entirely of integers. This observation, due to R. W. Brockett and
D. Dobkin [{\sl Linear Algebra and its Applic.\ \bf19} (1978), 207--235, Theorem
14], can be used to obtain nontrivial lower bounds on the rank over the integers.
For example, M. R. Brown and D. Dobkin have used it to show that realizations 
of $n\times n$ polynomial multiplication over the integers must have $t≥3.52n$,
for all sufficiently large $n$ [to appear].$\bigrp$

\yskip
(b) $\left(\vcenter{\halign{#\cr
1 0 0 0 0 1 1 1\cr 0 1 0 0 1 1 0 1\cr 0 0 1 1 0 0 1 1\cr}}\right)$,
$\left(\vcenter{\halign{#\cr
1 0 0 0 0 1 1 1\cr 0 1 0 0 0 1 0 1\cr 0 0 1 0 0 0 1 1\cr 0 0 0 1 1 0 0 1\cr}}
\right)$,
$\left(\vcenter{\halign{#\cr 1 0 0 0 0 0 0 0\cr
\=1 \=1 0 0 0 1 0 0\cr \=1 1 \=1 0 0 0 1 0\cr 1 0 0 \=1 \=1 \=1 \=1 1\cr
0 0 1 0 1 0 0 0\cr 0 0 0 1 0 0 0 0\cr}}\right)$.

\ansno 59. (a)\9 In $\Sigma↓1$, for example, we can group all terms having a
common value of $j$ and $k$ into a single trilinear term; this gives $\nu↑2$
trilinear terms when $(j,k)\in E{\times}E$, plus $\nu↑2$ when $(j,k)\in E{\times}O$
and $\nu↑2$ when $(j,k)\in O{\times}E$. When $\s\jit=k$ we can also include
$-x↓{jj\,}y↓{j\s\jit\,}z↓{\s\jit j\,}$ in $\Sigma↓1$, free of charge.\xskip
[In the case $n=10$, the method multiplies 10 by 10 matrices with 710
multiplications; this is fewer than required by any other known method.]

(b)\9 Corresponding to the left-hand side of the stated identity we get the
terms$$x↓{i+ε,j+\zeta\,}y↓{j+\zeta,k+\eta\,}z↓{k+\eta,i+ε}+x↓{j+\eta,k+ε\,}
y↓{k+ε,i+\zeta\,}z↓{i+\zeta,j+\eta}+x↓{k+\zeta,i+\eta\,}y↓{i+\eta,j+ε\,}
z↓{j+ε,k+\zeta}$$summed over $(i,j,k)\in S$ and $0≤ε,\zeta,\eta≤1$, so we get all
the trilinear terms of the form $x↓{ij\,}y↓{jk\,}z↓{ki}$ except when $\lceil i/2
\rceil=\lceil j/2\rceil=\lceil k/2\rceil$; however, these missing terms can
all be included in $\Sigma↓1$, $\Sigma↓2$, or $\Sigma↓3$. The sum $\Sigma↓1$
turns out to include terms of the form $x↓{i+ε,j+\zeta\,}y↓{i+\eta,j+ε}$ times
some sum of $z$'s, so it contributes $8\nu↑2$ terms to the trilinear realization;
and $\Sigma↓2$, $\Sigma↓3$ are similar. To verify that the $aB\Cscr$ terms
cancel out, note that they are $\sum(-1)↑{\zeta+\eta}x↓{i+ε,j+\zeta\,}
y↓{k+ε,i+\zeta\,}z↓{j+ε,k+\zeta}$, so $\eta=1$ cancels with $\eta=0$.\xskip
[This technique leads to asymptotic improvements over Strassen's method whenever
${1\over3}n↑3+6n↑2-{4\over3}n<n↑{\lg7}$, namely when $36≤n≤184$. The best case
occurs for $n=70$, whence it follows that matrices can be multiplied in
$O(n↑α)$ steps for $α=\log 143640/\!\log 70=2.79512$. By refining this
approach, Pan has shown that $M(n,n,n)≤{1\over3}n↑3+{9\over2}n↑2+{2\over3}n$,
leading to the improved exponent $α=\log 47264/\!\log 48=2.780404$. Reference:
{\sl Proc.\ IEEE Symp.\ Foundations of Comp.\ Sci.\ \bf19} (1978), 166--176.]
%folio 840 galley 9b (C) Addison-Wesley 1978	*
\ansbegin{4.7}

\ansno 1. Find the first nonzero coefficient
$V↓m$, as in (4), and divide both $U(z)$ and $V(z)$ by
$z↑m$ (shifting the coefficients $m$ places to the left). The
quotient will be a power series iff $U↓0 = \cdots = U↓{m-1}
= 0$.

\ansno 2. $V↑{n+1}↓{\!0}W↓n = V↑{n}↓{\!0}U↓n - (V↑{1}↓{\!0}W↓0)(V↑{n-1}↓{\!0}V↓n)
- (V↑{2}↓{\!0}W↓1)(V↑{n-2}↓{\!0}V↓{n-1}) - \cdots - (V↑{n}↓{\!0}W↓{n-1})(V↑{0}↓{\!0}
V↓1)$.
Thus, we start by replacing $(U↓j, V↓j)$ by $(V↑{j}↓{\!0}U↓j,
V↑{j-1}↓{\!0}V↓j)$ for $j ≥ 1$, then set $W↓n ← U↓n - \sum ↓{0≤k<n}
W↓kV↓{n-k}$ for $n ≥ 0$, finally replace $W↓j$ by $W↓j/V↑{j+1}↓{\!0}$
for $j ≥ 0$. Similar techniques are possible in connection with other algorithms
in this section.
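The recurrence underlying this answer, before the scaling trick that postpones the divisions, is $W↓n=\biglp U↓n-\sum↓{0≤k<n}W↓kV↓{n-k}\bigrp/V↓0$. A direct rendering in Python (an illustration; exact rational arithmetic avoids rounding questions):

```python
# First N coefficients of W(z) = U(z)/V(z), assuming V[0] != 0.
from fractions import Fraction

def series_div(U, V, N):
    assert V[0] != 0
    W = []
    for n in range(N):
        s = Fraction(U[n] if n < len(U) else 0)
        s -= sum(W[k] * V[n - k] for k in range(n) if n - k < len(V))
        W.append(s / V[0])
    return W
```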

\ansno 3. Yes. When $α = 0$, it is easy to prove by induction
that $W↓1 = W↓2 = \cdots = 0$. When $α = 1$, we find $W↓n =
V↓n$, by the ``cute'' identity
$$\sum ↓{1≤k≤n}\left({k - (n - k)\over n}\right)V↓kV↓{n-k} = V↓0V↓n.$$

\ansno 4. If $W(z) = e↑{V(z)}$, then $W↑\prime (z)
= V↑\prime (z)W(z)$; we find $W↓0 = 1$, and
$$W↓n = \sum ↓{1≤k≤n} {k\over n} V↓kW↓{n-k},\qquad\hbox{for }n ≥ 1.$$
If $W(z) = \ln\biglp 1 + V(z)\bigrp $, then $W↑\prime
(z) + W↑\prime (z)V(z) = V↑\prime (z)$; the rule is $W↓0 = 0$,
and $W↓1 + 2W↓2z + \cdots = V↑\prime (z)/\biglp 1 + V(z)\bigrp$.
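The exponential recurrence can be sketched as follows (an illustration, assuming $V↓0=0$ so that $W↓0=1$):

```python
# W(z) = exp(V(z)): W_0 = 1 and W_n = sum over 1<=k<=n of (k/n) V_k W_{n-k}.
from fractions import Fraction

def series_exp(V, N):
    W = [Fraction(1)]
    for n in range(1, N):
        W.append(sum(Fraction(k, n) * V[k] * W[n - k]
                     for k in range(1, n + 1) if k < len(V)))
    return W
```

With $V(z)=z$ this reproduces $W↓n=1/n!$, since only the $k=1$ term survives.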

$\biglp$By exercise 6, the logarithm can be obtained
to order $n$ in $O(n\log n)$ operations.\xskip R. P. Brent observes
that $\exp\biglp V(z)\bigrp$ can also be calculated with this
asymptotic speed by applying Newton's method to $f(x) = \ln
x - V(z)$; therefore general exponentiation $\biglp 1 + V(z)\bigrp ↑α = \exp\biglp
α \ln(1 + V(z))\bigrp$ is $O(n \log n)$ too.\xskip[{\sl Analytic
Computational Complexity}, ed.\ by J. F. Traub (New York: Academic
Press, 1975), 172--176.]$\bigrp$

\ansno 5. We get the original series back again. This can be
used to test a reversion algorithm.

\ansno 6. $\varphi (x) = x + x\biglp 1 - xV(z)\bigrp $, cf.\
Algorithm 4.3.3R\null. Thus after $W↓0$, $\ldotss$, $W↓{N-1}$ are known,
we input $V↓N$, $\ldotss$, $V↓{2N-1}$, compute $(W↓0 + \cdots +
W↓{N-1}z↑{N-1})(V↓0 + \cdots + V↓{2N-1}z↑{2N-1}) = 1 + R↓0z↑N
+ \cdots + R↓{N-1}z↑{2N-1} + O(z↑{2N})$, and determine $W↓N$,
$\ldotss$, $W↓{2N-1}$ by the formula $W↓N + \cdots + W↓{2N-1}z↑{N-1}
= -(W↓0 + \cdots + W↓{N-1}z↑{N-1})(R↓0 + \cdots + R↓{N-1}z↑{N-1})
+ O(z↑N)$.\xskip [{\sl Numer.\ Math.\ \bf 22} (1974), 341--348; this algorithm
was, in essence, first published by M. Sieveking, {\sl Computing
\bf 10} (1972), 153--156.]\xskip Note that the total time for $N$
coefficients is $O(N \log N)$ if we use ``fast'' polynomial
multiplication (exercise 4.6.4--57).
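The doubling step can be sketched in Python (an illustration of the Newton iteration, with truncated multiplication written out naively rather than by fast methods):

```python
# If W matches 1/V(z) through z^(N-1), then W*(2 - V*W) matches through
# z^(2N-1); iterate from the single term W_0 = 1/V_0.
from fractions import Fraction

def mul_trunc(A, B, n):
    C = [Fraction(0)] * n
    for i, a in enumerate(A[:n]):
        for j, b in enumerate(B[:n - i]):
            C[i + j] += a * b
    return C

def series_recip(V, N):
    W = [Fraction(1) / V[0]]
    n = 1
    while n < N:
        n = min(2 * n, N)
        VW = mul_trunc(V, W, n)
        T = [2 - VW[0]] + [-c for c in VW[1:]]   # 2 - V(z)W(z), truncated
        W = mul_trunc(W, T, n)
    return W
```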

\ansno 7. $W↓n = {mk\choose k}/n$ when $n = (m - 1)k + 1$, otherwise
0.\xskip(Cf.\ exercise 2.3.4.4--11.)
%folio 842 galley 10a (C) Addison-Wesley 1978	*
\ansno 8. Input $G↓1$ in step L1, and $G↓n$ in step L2. In
step L4, the output should be $(U↓{n-1}G↓1 + 2U↓{n-2}G↓2 + \cdots
+ nU↓0G↓n)/n$.\xskip (The running time of the order $N↑3$ algorithm
is hereby increased by only order $N↑2$. The value $W↓1 = G↓1$
might be output in step L1.)

{\sl Note:} Algorithms T and N determine $V↑{-1}\biglp
U(z)\bigrp$; the algorithm in this exercise determines $G\biglp
V↑{-1}(z)\bigrp $, which is somewhat different. Of course, the
results can all be obtained by a sequence of operations of reversion
and composition (exercise 11), but it is helpful to have more
direct algorithms for each case.

\ansno 9. $\vtop{\halign{$# $⊗$\ctr{#}$\quad⊗$\ctr{#}$\quad⊗$\ctr{#}$\quad
⊗$\ctr{#}$\quad⊗$\ctr{#}$\cr
⊗n = 1⊗n = 2⊗n = 3⊗n = 4⊗n = 5\cr
\noalign{\vskip3pt}
T↓{1n}⊗1⊗1⊗2⊗5⊗14\cr
T↓{2n}⊗⊗1⊗2⊗5⊗14\cr
T↓{3n}⊗⊗⊗1⊗3⊗\99\cr
T↓{4n}⊗⊗⊗⊗1⊗\94\cr
T↓{5n}⊗⊗⊗⊗⊗\91\lower3pt\null\cr}}$

\ansno 10. Form $y↑{1/α} = x(1 + a↓1x + a↓2x↑2 + \cdotss
)↑{1/α} = x(1 + c↓1x + c↓2x↑2 + \cdotss)$ by means of Eq.\ (9);
then revert the latter series.\xskip (See the remarks following Eq.\
1.2.11.3--11.)

\ansno 11. Set $W↓0 ← U↓0$, and set $(T↓k, W↓k) ← (V↓k, 0)$
for $1 ≤ k ≤ N$. Then for $n = 1$, 2, $\ldotss$, $N$, do the following:
Set $W↓j ← W↓j + U↓nT↓j$ for $n ≤ j ≤ N$; and then set $T↓j
← T↓{j-1}V↓1 + \cdots + T↓nV↓{j-n}$ for $j = N$, $N - 1$, $\ldotss
$, $n + 1$.
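The algorithm just described can be transcribed as follows (an illustration; the function name is mine). Note that the downward loop on $j$ is essential, since the new $T↓j$ depends only on old values with smaller indices:

```python
# W(z) = U(V(z)) + O(z^(N+1)) for power series with V_0 = 0; the array T
# holds V(z)^n truncated to N terms during step n.
def compose(U, V, N):
    assert len(U) == N + 1 and len(V) == N + 1 and V[0] == 0
    T = V[:]                        # T = V^1
    W = [0] * (N + 1)
    W[0] = U[0]
    for n in range(1, N + 1):
        for j in range(n, N + 1):
            W[j] += U[n] * T[j]
        for j in range(N, n, -1):   # T <- V^(n+1): T_j = T_{j-1}V_1+...+T_nV_{j-n}
            T[j] = sum(T[i] * V[j - i] for i in range(n, j))
    return W
```

For example, with $U(x)=1+x+x↑2$ and $V(z)=z+z↑2$ the result is $1+z+2z↑2+2z↑3+z↑4$.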

Here $T(z)$ represents $V(z)↑n$. An {\sl on-line}
power-series algorithm for this problem, analogous to Algorithm
T\null, could be constructed, but it would require about $N↑2/2$
storage locations. There is also an on-line algorithm that
solves this exercise and needs only $O(N)$ storage locations:
We may assume that $V↓1 = 1$, if $U↓k$ is replaced by $U↓kV↑{k}↓{\!1}$
and $V↓k$ is replaced by $V↓k/V↓1$ for all $k$. Then we may
revert $V(z)$ by Algorithm L\null, using its output as input to the
algorithm of exercise 8 with $G↓1 = U↓1$, $G↓2 = U↓2$, etc., thus
computing $U\biglp (V↑{-1})↑{-1}(z)\bigrp - U↓0$.

Brent and Kung have constructed several algorithms
that are asymptotically faster. For example, we can evaluate
$U(x)$ for $x = V(z)$ by a slight variant of exercise 4.6.4--42(c),
doing about $2\sqrt{n}$ chain multiplications of cost $M(n)$
and about $n$ parameter multiplications of cost $n$, where $M(n)$ is the
number of operations needed to multiply power series to order $n$; the total
time is therefore $O\biglp \sqrt{n}M(n) + n↑2\bigrp = O(n↑2)$.
A still faster method can be based on the identity $U\biglp
V↓0(z) + z↑mV↓1(z)\bigrp = U\biglp V↓0(z)\bigrp + z↑mU↑\prime
\biglp V↓0(z)\bigrp V↓1(z) + z↑{2m}U↑{\prime\prime}\biglp V↓0(z)\bigrp
V↓1(z)↑2/2! + \cdotss$, extending to about $n/m$ terms, where
we choose $m \approx \sqrt{n/\!\log n}$; the first term $U\biglp
V↓0(z)\bigrp$ is evaluated in $O\biglp mn(\log n)↑2\bigrp$
operations using a method somewhat like that in exercise 4.6.4--43.
Since we can go from $U↑{(k)}\biglp V↓0(z)\bigrp$
to $U↑{(k+1)}\biglp V↓0(z)\bigrp$ in $O(n \log n)$ operations
by differentiating and dividing by $V↑\prime↓{\!0}(z)$,
the entire procedure takes $O\biglp mn(\log n)↑2 + (n/m)n
\log n\bigrp = O(n \log n)↑{3/2}$ operations.\xskip[{\sl JACM}, to appear.]

\ansno 12. Polynomial division is trivial unless $m ≥ n ≥ 1$.
Assuming the latter, the equation $u(x) = q(x)v(x) + r(x)$ is
equivalent to $U(z) = Q(z)V(z) + z↑{m-n+1}R(z)$ where $U(x)
= x↑mu(x↑{-1})$, $V(x) = x↑nv(x↑{-1})$, $Q(x) = x↑{m-n}q(x↑{-1})$, and
$R(x) = x↑{n-1}r(x↑{-1})$ are the ``reverse'' polynomials of
$u$, $v$, $q$, and $r$.

To find $q(x)$ and $r(x)$, compute the first $m
- n + 1$ coefficients of the power series $U(z)/V(z) = W(z)
+ O(z↑{m-n+1})$; then compute the power series $U(z) - V(z)W(z)$,
which has the form $z↑{m-n+1}T(z)$ where $T(z) = T↓0 + T↓1z
+ \cdotss$. Note that $T↓j = 0$ for all $j ≥ n$; hence $Q(z)
= W(z)$ and $R(z) = T(z)$ satisfy the requirements.
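The whole procedure can be sketched in Python (an illustration of the method, not code from the text; exact rationals keep the division clean):

```python
# Divide u(x) by v(x) through the reversed power series U(z)/V(z).
from fractions import Fraction

def polydiv(u, v):
    """u, v: coefficient lists, constant term first, v[-1] != 0; returns (q, r)."""
    m, n = len(u) - 1, len(v) - 1
    if m < n:
        return [], u[:]
    U = [Fraction(c) for c in reversed(u)]       # U(z) = z^m u(1/z)
    V = [Fraction(c) for c in reversed(v)]       # V(z) = z^n v(1/z)
    W = []                                       # U(z)/V(z) + O(z^(m-n+1))
    for k in range(m - n + 1):
        s = U[k] - sum(W[i] * V[k - i] for i in range(max(0, k - n), k))
        W.append(s / V[0])
    q = list(reversed(W))                        # q(x) = x^(m-n) W(1/x)
    qv = [Fraction(0)] * (m + 1)                 # q(x) v(x)
    for i, qc in enumerate(q):
        for j, vc in enumerate(v):
            qv[i + j] += qc * vc
    return q, [u[i] - qv[i] for i in range(n)]   # r = u - q*v, degree < n
```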

\ansno 13. If $U(z)=z+U↓kz↑k+\cdots$ and $V(z)=z↑k+V↓{k+1}z↑{k+1}+\cdotss$, we
find that the difference
$V\biglp U(z)\bigrp-U↑\prime(z)V(z)=\sum↓{j≥1}z↑{2k+j-1}j\biglp
U↓kV↓{k+j}-U↓{k+j}+($polynomial involving only $U↓k$, $\ldotss$, $U↓{k+j-1}$,
$V↓{k+1}$, $\ldotss$, $V↓{k+j-1})\bigrp$; hence $V(z)$ is unique if $U(z)$ is
given and $U(z)$ is unique if $V(z)$ and $U↓k$ are given.

Let's change notation for a moment and consider two new algorithms. The first
one solves the equation $V\biglp z+z↑kU(z)\bigrp=\biglp 1+z↑{k-1}W(z)\bigrp V(z)+
z↑{k-1}S(z)+O(z↑{k-1+n})$ for $V(z)=V↓0+V↓1z+\cdots+V↓{n-1}z↑{n-1}$, given
$U(z)$, $W(z)$, $S(z)$, and $n$. If $n=1$, let $V↓0=-S(0)/W(0)$; or let $V↓0=1$
when $S(0)=W(0)=0$. To go from $n$ to $2n$, let $V\biglp z+z↑kU(z)\bigrp=\biglp
1+z↑{k-1}W(z)\bigrp V(z)+z↑{k-1}S(z)-z↑{k-1+n}R(z)+O(z↑{k-1+2n})$,
$1+z↑{k-1}\A W(z)=\biglp z/(z+z↑kU(z))\bigrp↑n\biglp 1+z↑{k-1}W(z)\bigrp$,
$\A S(z)=\biglp z/(z+z↑kU(z))\bigrp↑nR(z)$, and let $\A V(z)=V↓n+V↓{n+1}z+\cdots
+V↓{2n-1}z↑{n-1}$ satisfy $\A V\biglp z+z↑kU(z)\bigrp=\biglp 1+z↑{k-1}\A W(z)\bigrp
\A V(z)+z↑{k-1}\A S(z)+O(z↑{k-1+n})$.

The second algorithm solves $W(z)U(z)+zU↑\prime(z)=V(z)+O(z↑n)$ for $U(z)=U↓0+
U↓1z+\cdots+U↓{n-1}z↑{n-1}$, given $V(z)$, $W(z)$, and $n$. If $n=1$, let
$U↓0=V(0)/W(0)$, or let $U↓0=1$ in case $V(0)=W(0)=0$. To go from $n$ to $2n$, let
$W(z)U(z)+zU↑\prime(z)=V(z)-z↑nR(z)+O(z↑{2n})$, and let $\A U(z)=U↓n+\cdots+
U↓{2n-1}z↑{n-1}$ be the solution to the equation
$\biglp n+W(z)\bigrp\A U(z)+z{\A U}↑\prime(z)=R(z)
+O(z↑n)$.

Resuming the notation of (27), the first algorithm can be used to solve
$\A V\biglp U(z)\bigrp=U↑\prime(z)\biglp z/U(z)\bigrp↑k\A V(z)$ to any desired
accuracy, and we set $V(z)=z↑k\A V(z)$. To find $P(z)$, suppose we have $V\biglp
P(z)\bigrp=P↑\prime(z)V(z)+O(z↑{2k-1+n})$, an equation that holds for $n=1$
when $P(z)=z+αz↑k$ and $α$ is arbitrary. 
We can go from $n$ to $2n$ by letting $V\biglp P(z)\bigrp
=P↑\prime(z)V(z)+z↑{2k-1+n}R(z)+O(z↑{2k-1+2n})$ and replacing $P(z)$ by $P(z)+
z↑{k+n}\A P(z)$, where the second algorithm is used to find $\A P(z)$ such that
$\biglp k+n-zV↑\prime(z)/V(z)\bigrp\A P(z)+z{\A P}↑\prime(z)=\biglp z↑k/V(z)\bigrp
R(z)+O(z↑n)$.
\vfill\end